I am running ArangoDB enterprise:3.8.6 via Docker, and my query takes longer than 30s.
When it fails, the error is: arangodb HTTPConnectionPool(host='127.0.0.1', port=8529): Read timed out. (read timeout=60)
My collection is around 4 GB and contains roughly 900k to 1.2 million documents.
How can I retrieve the complete collection, with all documents, without hitting this error?
Python code (runs locally on my machine)
from arango import ArangoClient
# Initialize the ArangoDB client.
client = ArangoClient()
# Connect to the database as the given user.
db = client.db(<db>, username=<username>, password=<password>)
# Plain string: no f-string needed because nothing is interpolated.
cursor = db.aql.execute('FOR doc IN students RETURN doc', batch_size=10000)
result = [doc for doc in cursor]
print(result[0])
[OUT]
arangodb HTTPConnectionPool(host='127.0.0.1', port=8529): Read timed out. (read timeout=60)
docker-compose.yml for ArangoDB
version: '3.7'
services:
  database:
    container_name: database__arangodb
    image: arangodb/enterprise:3.8.6
    environment:
      - ARANGO_LICENSE_KEY=<key>
      - ARANGO_ROOT_PASSWORD=root
      - ARANGO_CONNECT_TIMEOUT=300
      - ARANGO_READ_TIMEOUT=600
    ports:
      - 8529:8529
    volumes:
      - C:/Users/dataset:/var/lib/arangodb3
What I tried
cursor = db.aql.execute('FOR doc IN <Collection> RETURN doc', stream=True)
while cursor.has_more(): # Fetch until nothing is left on the server.
    cursor.fetch()
while not cursor.empty(): # Pop until nothing is left on the cursor.
    cursor.pop()
[OUT] CursorNextError: [HTTP 404][ERR 1600] cursor not found
# And
cursor = db.aql.execute('FOR doc IN <Collection> RETURN doc', stream=True, ttl=3600)
collection = [doc for doc in cursor]
[OUT] nothing # Keeps running for more than 1.5 hours without finishing
What worked, but only for 100 documents
# This worked
cursor = db.aql.execute('FOR doc IN <Collection> LIMIT 100 RETURN doc', stream=True)
collection = [doc for doc in cursor]
 
                        
You can increase the HTTP client's timeout by using a custom HTTP client for python-arango.
The default timeout is 60 seconds, which matches the read timeout=60 in your error message.