Understanding how to read Maven Central indexes and when to use Luke tools


I am currently working on a small project to read the full details of every artifact on Maven Central. I came across this page: https://maven.apache.org/repository/central-index.html. It essentially lists a three-step process:

  1. Download the index.
  2. Use the Maven Indexer CLI (indexer-cli) project to unpack the index.gz.
  3. Use a Lucene viewer such as Luke to export the index as XML.

However, I went through the indexer-reader examples and unit tests (https://maven.apache.org/maven-indexer/indexer-reader/index.html), and it seems clear that simply using ChunkReader.splitIterator is sufficient to get all the details that the three steps above produce.

In fact, the linked page itself suggests the same.

Verbatim - "Indexer Reader is a dependency-less library that is able to read published (remote) index with incremental update support, making usable to integrate published Maven Indexes into any engine without depending on maven-indexer-core and its transitive dependencies."

Question 1: Why does https://maven.apache.org/repository/central-index.html suggest a three-step workflow to achieve the same result?

Question 2: Is there any documentation that explains how and when the incremental updates and the full index upload happen? I found this blog post from 2009 and want to confirm that it is still valid: https://blog.sonatype.com/2009/05/nexus-indexer-20-incremental-downloading/
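
To make Question 2 concrete, here is a minimal Java sketch of how a client might decide between incremental chunks and a full download. It is not taken from maven-indexer. It assumes the published nexus-maven-repository-index.properties file carries the keys nexus.index.chain-id, nexus.index.last-incremental and nexus.index.incremental-N, and that chunks are published as nexus-maven-repository-index.<counter>.gz. The class and method names (IncrementalPlan, filesToDownload) are hypothetical, and the decision rule is my reading of the mechanism, which is exactly what I would like confirmed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

// Hypothetical helper, not part of maven-indexer.
public class IncrementalPlan {
    // Key and file names below are assumptions about the published index layout.
    static final String CHAIN_ID = "nexus.index.chain-id";
    static final String LAST_INCREMENTAL = "nexus.index.last-incremental";
    static final String INCREMENTAL_PREFIX = "nexus.index.incremental-";
    static final String FULL_INDEX = "nexus-maven-repository-index.gz";

    /**
     * Returns the files a client would fetch: an empty list when already up to date,
     * the missing chunk files when an incremental update is possible,
     * or only the full index file otherwise.
     */
    static List<String> filesToDownload(Properties remote, String localChainId, int localLast) {
        String remoteChain = remote.getProperty(CHAIN_ID);
        String lastValue = remote.getProperty(LAST_INCREMENTAL);

        // A missing or different chain id is treated as "the index was rebuilt".
        if (remoteChain == null || lastValue == null || !remoteChain.equals(localChainId)) {
            return List.of(FULL_INDEX);
        }

        int remoteLast = Integer.parseInt(lastValue.trim());
        if (remoteLast == localLast) {
            return List.of(); // nothing new was published
        }
        if (remoteLast < localLast) {
            return List.of(FULL_INDEX); // local state is ahead of remote: start over
        }

        // Collect the chunk counters the server still advertises.
        Set<Integer> advertised = new HashSet<>();
        for (String key : remote.stringPropertyNames()) {
            if (key.startsWith(INCREMENTAL_PREFIX)) {
                advertised.add(Integer.parseInt(remote.getProperty(key).trim()));
            }
        }

        // Every chunk between the local and remote counters must still be available.
        List<String> chunks = new ArrayList<>();
        for (int counter = localLast + 1; counter <= remoteLast; counter++) {
            if (!advertised.contains(counter)) {
                return List.of(FULL_INDEX); // gap in the chain: fall back to the full index
            }
            chunks.add("nexus-maven-repository-index." + counter + ".gz");
        }
        return chunks;
    }
}
```

With a properties file advertising counters 823 to 825 and a local counter of 823, this sketch would request chunks 824 and 825; with a different chain id it would fall back to the full index.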

Regards
