I am trying to spark-submit using Amazon ec2 with the following:
spark-submit --packages org.apache.hadoop:hadoop-aws:2.7.1 --master spark://amazonaws.com SimpleApp.py
and I end up with the following error. It seems to be that it is looking for hadoop. My ec2 cluster was created using spark-ec2 command.
Ivy Default Cache set to: /home/adas/.ivy2/cache
The jars for the packages stored in: /home/adas/.ivy2/jars
:: loading settings :: url = jar:file:/home/adas/spark/spark-2.1.0-bin-hadoop2.7/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.hadoop#hadoop-aws added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
:: resolution report :: resolve 66439ms :: artifacts dl 0ms
:: modules in use:
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 1 | 0 | 0 | 0 || 0 | 0 |
---------------------------------------------------------------------
:: problems summary ::
:::: WARNINGS
module not found: org.apache.hadoop#hadoop-aws;2.7.1
==== local-m2-cache: tried
file:/home/adas/.m2/repository/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.pom
-- artifact org.apache.hadoop#hadoop-aws;2.7.1!hadoop-aws.jar:
file:/home/adas/.m2/repository/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.jar
==== local-ivy-cache: tried
/home/adas/.ivy2/local/org.apache.hadoop/hadoop-aws/2.7.1/ivys/ivy.xml
-- artifact org.apache.hadoop#hadoop-aws;2.7.1!hadoop-aws.jar:
/home/adas/.ivy2/local/org.apache.hadoop/hadoop-aws/2.7.1/jars/hadoop-aws.jar
==== central: tried
https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.pom
-- artifact org.apache.hadoop#hadoop-aws;2.7.1!hadoop-aws.jar:
https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.jar
==== spark-packages: tried
http://dl.bintray.com/spark-packages/maven/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.pom
-- artifact org.apache.hadoop#hadoop-aws;2.7.1!hadoop-aws.jar:
http://dl.bintray.com/spark-packages/maven/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.jar
::::::::::::::::::::::::::::::::::::::::::::::
:: UNRESOLVED DEPENDENCIES ::
::::::::::::::::::::::::::::::::::::::::::::::
:: org.apache.hadoop#hadoop-aws;2.7.1: not found
::::::::::::::::::::::::::::::::::::::::::::::
:::: ERRORS
Server access error at url https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.pom (java.net.NoRouteToHostException: No route to host (Host unreachable))
Server access error at url https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.jar (java.net.NoRouteToHostException: No route to host (Host unreachable))
Server access error at url http://dl.bintray.com/spark-packages/maven/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.pom (java.net.NoRouteToHostException: No route to host (Host unreachable))
Server access error at url http://dl.bintray.com/spark-packages/maven/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.jar (java.net.NoRouteToHostException: No route to host (Host unreachable))
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: org.apache.hadoop#hadoop-aws;2.7.1: not found]
at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1078)
at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:296)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:160)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Sorry everyone, it was just some local proxy issues.