SeaweedFS - Added new volume server but not able to add new files


I have one master (x.x.x.61), one volume server (x.x.x.63), and one filer + S3 API (x.x.x.62) set up on 3 separate machines. I added a new volume server (x.x.x.64) because I had maxed out the storage space on the first volume server, but I'm still not able to add new files through the filer UI (http://x.x.x.62:8888).

In my filer logs, I noticed that it keeps trying to write to the first volume server, the one that's out of space. Am I missing a configuration step for it to use the new volume server?

E1221 11:09:48.027930 upload_content.go:351 unmarshal http://x.x.x.63:8080/7,2bafadaa4666: {"error":"failed to write to local disk: write data/chrisDir_7.dat: no space left on device"}{"name":"app_progress4.apk","size":2353734,"eTag":"92b10892"}
W1221 11:09:48.027950 upload_content.go:168 uploading 2 to http://x.x.x.63:8080/7,2bafadaa4666: unmarshal http://x.x.x.63:8080/7,2bafadaa4666: invalid character '{' after top-level value
E1221 11:09:48.027965 filer_server_handlers_write_upload.go:209 upload error: unmarshal http://x.x.x.63:8080/7,2bafadaa4666: invalid character '{' after top-level value
I1221 11:09:48.028022 common.go:70 response method:POST URL:/buckets/chrisDir/ with httpStatus:500 and JSON:{"error":"unmarshal http://x.x.x.63:8080/2,2ba84b2894a7: invalid character '{' after top-level value"}

In the master log, I can see that the second volume server was added successfully and that the master.toml maintenance scripts ran to rebalance:

I1221 11:36:09.522690 node.go:225 topo:DefaultDataCenter:DefaultRack adds child x.x.x.64:8080
I1221 11:36:09.522716 node.go:225 topo:DefaultDataCenter:DefaultRack:x.x.x.64:8080 adds child
I1221 11:36:09.522724 master_grpc_server.go:138 added volume server 0: x.x.x.64:8080 [3caad049-38a6-43f6-8192-d1082c5e838b]
I1221 11:36:09.522744 master_grpc_server.go:49 found new uuid:x.x.x.64:8080 [3caad049-38a6-43f6-8192-d1082c5e838b] , map[x.x.x.63:8080:[5005b287-c812-4dba-ba41-9b5a6a022f12] x.x.x.64:8080:[3caad049-38a6-43f6-8192-d1082c5e838b]]
I1221 11:36:09.522866 volume_layout.go:393 Volume 11 becomes writable
I1221 11:36:09.522880 master_grpc_server.go:199 master see new volume 11 from x.x.x.64:8080
I1221 11:38:33.481721 master_server.go:323 executing: lock []
I1221 11:38:33.482821 master_server.go:323 executing: ec.encode [-fullPercent=95 -quietFor=1h]
I1221 11:38:33.483925 master_server.go:323 executing: ec.rebuild [-force]
I1221 11:38:33.484372 master_server.go:323 executing: ec.balance [-force]
I1221 11:38:33.484777 master_server.go:323 executing: volume.balance [-force]
2022/12/21 11:38:48 copying volume 21 from x.x.x.63:8080 to x.x.x.64:8080
I1221 11:38:48.486778 volume_layout.go:407 Volume 21 has 0 replica, less than required 1
I1221 11:38:48.486798 volume_layout.go:380 Volume 21 becomes unwritable
I1221 11:38:48.494998 volume_layout.go:393 Volume 21 becomes writable
2022/12/21 11:38:48 tailing volume 21 from x.x.x.63:8080 to x.x.x.64:8080
2022/12/21 11:38:58 deleting volume 21 from x.x.x.63:8080
....

How I start the master

./weed master -mdir='.'

How I start the volume server

./weed volume -max=100 -mserver="x.x.x.61:9333" -dir="$dataDir"
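
The new volume server on x.x.x.64 was presumably started the same way (its Max of 100 in the topology output below matches -max=100); $newDataDir is a placeholder for its own data directory:

./weed volume -max=100 -mserver="x.x.x.61:9333" -dir="$newDataDir"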

How I start the filer and S3

./weed filer -master="x.x.x.61:9333" -s3

What's in $HOME/.seaweedfs

drwxrwxr-x  2 seaweedfs seaweedfs 4096 Dec 20 16:01 .
drwxr-xr-x 20 seaweedfs seaweedfs 4096 Dec 20 16:01 ..
-rw-r--r--  1 seaweedfs seaweedfs 2234 Dec 20 15:57 master.toml

Content of master.toml file

# Put this file to one of the location, with descending priority
#    ./master.toml
#    $HOME/.seaweedfs/master.toml
#    /etc/seaweedfs/master.toml
# this file is read by master

[master.maintenance]
# periodically run these scripts are the same as running them from 'weed shell'
scripts = """
  lock
  ec.encode -fullPercent=95 -quietFor=1h
  ec.rebuild -force
  ec.balance -force
  volume.deleteEmpty -quietFor=24h -force
  volume.balance -force
  volume.fix.replication
  s3.clean.uploads -timeAgo=24h
  unlock
"""
sleep_minutes = 7          # sleep minutes between each script execution


[master.sequencer]
type = "raft"     # Choose [raft|snowflake] type for storing the file id sequence
# when sequencer.type = snowflake, the snowflake id must be different from other masters
sequencer_snowflake_id = 0     # any number between 1~1023


# configurations for tiered cloud storage
# old volumes are transparently moved to cloud for cost efficiency
[storage.backend]
[storage.backend.s3.default]
enabled = false
aws_access_key_id = ""         # if empty, loads from the shared credentials file (~/.aws/credentials).
aws_secret_access_key = ""     # if empty, loads from the shared credentials file (~/.aws/credentials).
region = "us-east-2"
bucket = "your_bucket_name"    # an existing bucket
endpoint = ""
storage_class = "STANDARD_IA"

# create this number of logical volumes if no more writable volumes
# count_x means how many copies of data.
# e.g.:
#   000 has only one copy, copy_1
#   010 and 001 has two copies, copy_2
#   011 has only 3 copies, copy_3
[master.volume_growth]
copy_1 = 7                # create 1 x 7 = 7 actual volumes
copy_2 = 6                # create 2 x 6 = 12 actual volumes
copy_3 = 3                # create 3 x 3 = 9 actual volumes
copy_other = 1            # create n x 1 = n actual volumes

# configuration flags for replication
[master.replication]
# any replication counts should be considered minimums. If you specify 010 and
# have 3 different racks, that's still considered writable. Writes will still
# try to replicate to all available volumes. You should only use this option
# if you are doing your own replication or periodic sync of volumes.
treat_replication_as_minimums = false

System status

curl http://localhost:9333/dir/assign?pretty=y
{
  "fid": "9,2bb2fd75d706",
  "url": "x.x.x.63:8080",
  "publicUrl": "x.x.x.63:8080",
  "count": 1
}
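
The master's /dir/assign endpoint also accepts a collection parameter, so the same check can be scoped to the chrisDir collection that the failing writes target; this invocation is a suggested diagnostic, not output captured from the cluster:

curl "http://x.x.x.61:9333/dir/assign?collection=chrisDir&pretty=y"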

curl http://x.x.x.61:9333/cluster/status?pretty=y
{
  "IsLeader": true,
  "Leader": "x.x.x.61:9333",
  "MaxVolumeId": 21
}

curl "http://x.x.x.61:9333/dir/status?pretty=y"
{
  "Topology": {
    "Max": 200,
    "Free": 179,
    "DataCenters": [
      {
        "Id": "DefaultDataCenter",
        "Racks": [
          {
            "Id": "DefaultRack",
            "DataNodes": [
              {
                "Url": "x.x.x.63:8080",
                "PublicUrl": "x.x.x.63:8080",
                "Volumes": 20,
                "EcShards": 0,
                "Max": 100,
                "VolumeIds": " 1-10 12-21"
              },
              {
                "Url": "x.x.x.64:8080",
                "PublicUrl": "x.x.x.64:8080",
                "Volumes": 1,
                "EcShards": 0,
                "Max": 100,
                "VolumeIds": " 11"
              }
            ]
          }
        ]
      }
    ],
    "Layouts": [
      {
        "replication": "000",
        "ttl": "",
        "writables": [
          6,
          1,
          2,
          7,
          3,
          4,
          5
        ],
        "collection": "chrisDir"
      },
      {
        "replication": "000",
        "ttl": "",
        "writables": [
          16,
          19,
          17,
          21,
          15,
          18,
          20
        ],
        "collection": "chrisDir2"
      },
      {
        "replication": "000",
        "ttl": "",
        "writables": [
          8,
          12,
          13,
          9,
          14,
          10,
          11
        ],
        "collection": ""
      }
    ]
  },
  "Version": "30GB 3.37 438146249f50bf36b4c46ece02a430f44152777f"
}

There is 1 answer

Answer from chrislusf:

Only volume 11 was created on the second volume server. You need to rebalance first.
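
A minimal way to trigger that rebalance manually is from weed shell, using the same commands that already appear in the master.toml maintenance scripts (assuming the master is reachable at x.x.x.61:9333):

./weed shell -master="x.x.x.61:9333"
> lock
> volume.balance -force
> unlock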