Compute CID of data in Go

I need to compute the CID of some data in Go. I'm currently using the code below, but the CID it produces differs from the one I get when I upload the file to Filecoin or add it to a Lotus node.

This is my code:

import (
    "fmt"

    "github.com/ipfs/go-cid"
    mh "github.com/multiformats/go-multihash"
)

func Cid(data []byte) {
    hash, err := mh.Sum(data, mh.SHA2_256, -1)
    if err != nil {
        panic(err)
    }

    c, err := cid.V1Builder{Codec: cid.DagProtobuf, MhType: mh.SHA2_256, MhLength: -1}.Sum(hash)

    fmt.Println("Filecoin CID:", c.String())
}

If I have a file with the content Hello, the above function prints the following CID:

bafybeievjvim3p2qexhlxek5yqftxsizm6fx2mtrgrsyfx43f52sozkwga

If I add it to Lotus or upload it to Filecoin, I get:

bafkreiayl6g3gitr7ys7kyng7sjywlrgimdoymco3jiyab6rozecmoazne

I would like to have a function that returns the same CID.
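The two strings above can be compared by decoding their binary prefixes. The following is a minimal sketch using only the Go standard library; `InspectCID` and `CIDPrefix` are hypothetical names introduced for illustration and are not part of go-cid. It assumes a base32 CIDv1 string (multibase prefix `b`) and reads the version, content codec, hash function, and digest length as unsigned varints.

```go
package main

import (
	"encoding/base32"
	"encoding/binary"
	"fmt"
	"strings"
)

// cidBase32 is the lowercase, unpadded RFC 4648 alphabet used by multibase prefix "b".
var cidBase32 = base32.NewEncoding("abcdefghijklmnopqrstuvwxyz234567").WithPadding(base32.NoPadding)

// CIDPrefix (hypothetical type) holds the leading fields of a binary CIDv1.
type CIDPrefix struct {
	Version, Codec, HashType, HashLen uint64
}

// InspectCID (hypothetical helper) decodes a base32 CIDv1 string and returns its prefix fields.
func InspectCID(s string) (CIDPrefix, error) {
	var p CIDPrefix
	if !strings.HasPrefix(s, "b") {
		return p, fmt.Errorf("not a base32 CIDv1 string: %q", s)
	}
	raw, err := cidBase32.DecodeString(s[1:])
	if err != nil {
		return p, err
	}
	// Each prefix field is an unsigned varint, read in order.
	for _, field := range []*uint64{&p.Version, &p.Codec, &p.HashType, &p.HashLen} {
		v, n := binary.Uvarint(raw)
		if n <= 0 {
			return p, fmt.Errorf("truncated CID")
		}
		*field = v
		raw = raw[n:]
	}
	// Whatever remains must be the digest itself.
	if uint64(len(raw)) != p.HashLen {
		return p, fmt.Errorf("digest length mismatch: have %d, want %d", len(raw), p.HashLen)
	}
	return p, nil
}

func main() {
	for _, s := range []string{
		"bafybeievjvim3p2qexhlxek5yqftxsizm6fx2mtrgrsyfx43f52sozkwga",
		"bafkreiayl6g3gitr7ys7kyng7sjywlrgimdoymco3jiyab6rozecmoazne",
	} {
		p, err := InspectCID(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s... version=%d codec=0x%x hash=0x%x len=%d\n",
			s[:7], p.Version, p.Codec, p.HashType, p.HashLen)
	}
}
```

Codec 0x70 is dag-pb and 0x55 is raw, so the first CID labels its content as a dag-pb node while the one from Lotus labels it as a raw block. Both use SHA2-256 (0x12) with a 32-byte digest.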


There are 2 answers

lajosdeme (BEST ANSWER)

I figured it out: I had to chunk the data as well. My updated code is below. If I now upload the same file to Filecoin, I get the same CID.

import (
    "bytes"
    "fmt"
    "io"

    chunker "github.com/ipfs/boxo/chunker"
    "github.com/ipfs/go-cid"
    mh "github.com/multiformats/go-multihash"
)

func Cid(data []byte) {
    // Split the data into 1 MiB chunks with a fixed-size splitter
    chunks := chunker.NewSizeSplitter(bytes.NewReader(data), 1024*1024)

    // Read the chunks back in order; concatenating them reproduces the input
    // bytes, so this only matches the real CID for data that fits in one chunk
    var buf bytes.Buffer
    for {
        chunk, err := chunks.NextBytes()
        if err == io.EOF {
            break
        } else if err != nil {
            panic(err)
        }

        buf.Write(chunk)
    }

    // Compute the SHA2-256 multihash of the bytes
    hash, err := mh.Sum(buf.Bytes(), mh.SHA2_256, -1)
    if err != nil {
        panic(err)
    }

    // Create a CID version 1. The raw codec yields the "bafkrei..." form shown
    // in the question; cid.DagProtobuf would yield a "bafybei..." CID instead
    c := cid.NewCidV1(cid.Raw, hash)

    // Print the CID as a string (CIDv1 defaults to base32)
    fmt.Println("IPFS CID in Golang:", c.String())
}
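
For data that fits in a single chunk, the `bafkrei...` CID quoted in the question has a simple byte layout: version 1, the raw codec (0x55), a SHA2-256 multihash header (0x12, length 0x20), then the digest, all encoded as lowercase unpadded base32 with a `b` prefix. The sketch below reproduces that layout with only the standard library; `RawCIDv1` is a hypothetical helper name, not a go-cid function, and it does not handle data that spans several chunks.

```go
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"fmt"
)

// cidBase32 is the lowercase, unpadded RFC 4648 alphabet used by multibase prefix "b".
var cidBase32 = base32.NewEncoding("abcdefghijklmnopqrstuvwxyz234567").WithPadding(base32.NoPadding)

// RawCIDv1 (hypothetical helper) returns the CIDv1 string of a single raw
// block: <version><codec><hash type><hash length><digest>, base32-encoded.
func RawCIDv1(data []byte) string {
	digest := sha256.Sum256(data)
	buf := []byte{
		0x01, // CID version 1
		0x55, // multicodec: raw
		0x12, // multihash: sha2-256
		0x20, // digest length: 32 bytes
	}
	buf = append(buf, digest[:]...)
	return "b" + cidBase32.EncodeToString(buf)
}

func main() {
	fmt.Println(RawCIDv1([]byte("Hello")))
	// prints bafkreiayl6g3gitr7ys7kyng7sjywlrgimdoymco3jiyab6rozecmoazne
}
```

For the five bytes `Hello` this gives the same string that Lotus reported in the question. In real code, `cid.NewCidV1(cid.Raw, hash)` from go-cid does the same job.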
Andrei Vukolov

To calculate the CID correctly, the DAG must be built over the chunked data, because in IPFS a CID identifies the DAG, not the data itself. lajosdeme's answer works for files that are not too large, because they can be represented as a linear list with a single branch containing the whole DAG. For larger files, this method does not work. The proper way is to run the DAG builder against a null ("/dev/null") blockstore, so that the blocks are hashed but never stored. The easiest way to do this is to use routines from the Boxo library.

The code for files then looks like this (it does not support directories, which require additional effort):

import (
    "bytes"
    "os"
    chunker "github.com/ipfs/boxo/chunker"
    "github.com/ipfs/go-cid"
    dsync "github.com/ipfs/go-datastore/sync"
    multicodec "github.com/multiformats/go-multicodec"
    "github.com/ipfs/boxo/ipld/unixfs/importer/balanced"
    uih "github.com/ipfs/boxo/ipld/unixfs/importer/helpers"
    "github.com/ipfs/go-datastore"
    blockstore "github.com/ipfs/boxo/blockstore"
    "github.com/ipfs/boxo/blockservice"
    offline "github.com/ipfs/boxo/exchange/offline"
    "github.com/ipfs/boxo/ipld/merkledag"
)

func Cid(filename string) string {
    fileData, err := os.ReadFile(filename)
    if err != nil {
        panic(err)
    }
    fileReader := bytes.NewReader(fileData)
    ds := dsync.MutexWrap(datastore.NewNullDatastore())
    bs := blockstore.NewBlockstore(ds)
    bs = blockstore.NewIdStore(bs)
    bsrv := blockservice.New(bs, offline.Exchange(bs))
    dsrv := merkledag.NewDAGService(bsrv)
    ufsImportParams := uih.DagBuilderParams{
        Maxlinks:  uih.DefaultLinksPerBlock, // Default max of 174 links per block
        RawLeaves: true,
        CidBuilder: cid.V1Builder{ // Use CIDv1 for all links
            Codec:    uint64(multicodec.Raw),
            MhType:   uint64(multicodec.Sha2_256), //SHA2-256
            MhLength: -1,
        },
        Dagserv: dsrv,
        NoCopy:  false,
    }
    ufsBuilder, err := ufsImportParams.New(chunker.NewSizeSplitter(fileReader, chunker.DefaultBlockSize)) // 256KiB chunks
    if err != nil {
        return cid.Undef.String()
    }
    nd, err := balanced.Layout(ufsBuilder)
    if err != nil {
        return cid.Undef.String()
    }
    return nd.Cid().String()
}

A working example lives here: https://github.com/twdragon/ipfs-cid-local
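
To illustrate why hashing the whole input only matches for small files, the sketch below splits data into 256 KiB chunks (the size noted in the code comment above) and computes a raw-leaf CID for each chunk using only the standard library. `LeafCIDs` and `rawCIDv1` are hypothetical helpers, not Boxo functions. The sketch assumes raw leaves and a fixed-size splitter; it does not build the dag-pb parent node that links the leaves, which is the part the Boxo builder handles, and it returns no leaves for empty input.

```go
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"fmt"
)

// cidBase32 is the lowercase, unpadded RFC 4648 alphabet used by multibase prefix "b".
var cidBase32 = base32.NewEncoding("abcdefghijklmnopqrstuvwxyz234567").WithPadding(base32.NoPadding)

// rawCIDv1 (hypothetical helper) encodes one block as CIDv1 / raw / sha2-256.
func rawCIDv1(block []byte) string {
	digest := sha256.Sum256(block)
	buf := append([]byte{0x01, 0x55, 0x12, 0x20}, digest[:]...)
	return "b" + cidBase32.EncodeToString(buf)
}

// LeafCIDs (hypothetical helper) splits data into fixed-size chunks and
// returns the raw-leaf CID of each chunk, in order.
func LeafCIDs(data []byte, chunkSize int) []string {
	var leaves []string
	for start := 0; start < len(data); start += chunkSize {
		end := start + chunkSize
		if end > len(data) {
			end = len(data)
		}
		leaves = append(leaves, rawCIDv1(data[start:end]))
	}
	return leaves
}

func main() {
	const chunkSize = 256 * 1024

	// One chunk: the only leaf is hashed over exactly the input bytes.
	fmt.Println(LeafCIDs([]byte("Hello"), chunkSize))

	// 600 KiB of zero bytes: three leaves, none hashed over the whole input.
	big := make([]byte, 600*1024)
	leaves := LeafCIDs(big, chunkSize)
	fmt.Println(len(leaves), "leaves; whole-data hash would be", rawCIDv1(big))
	for i, l := range leaves {
		fmt.Println(i, l)
	}
}
```

With a single chunk, the only leaf CID equals the hash-based CID of the input. With several chunks, the root is a separate node that links the leaves, so its CID cannot be obtained by hashing the concatenated bytes.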