Where does PDF/Office document generation fit into clean/onion architecture?


I need to add docx/excel reports to the following solution (solution structure described below).

Question

Where does my document generation fit?

Solution description

  • Presentation references Application
  • Application references Domain
  • Infrastructure isn't referenced directly from above projects
  • The numbering doesn't match the dependency direction! E.g. every project references Infrastructure.CrossCutting, but none (except DI) references DataAccess. (A sketch of that DI wiring follows the list.)
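
For illustration, a minimal sketch of how that wiring typically looks in the DI/composition-root project. All type names here are hypothetical and only stand in for the real projects, assuming Microsoft.Extensions.DependencyInjection is used:

using Microsoft.Extensions.DependencyInjection;

// Hypothetical Domain/Application abstractions and use case.
public interface ICustomerRepository { }

public class CreateOrderUseCase
{
    private readonly ICustomerRepository customers;
    public CreateOrderUseCase(ICustomerRepository customers) => this.customers = customers;
}

// Hypothetical Infrastructure.DataAccess implementation.
public class CustomerRepository : ICustomerRepository { }

// The DI project is the only one that references both Infrastructure and the inner layers.
public static class CompositionRoot
{
    public static IServiceCollection AddSolutionServices(this IServiceCollection services)
    {
        services.AddScoped<ICustomerRepository, CustomerRepository>(); // Infrastructure behind a Domain interface
        services.AddScoped<CreateOrderUseCase>();                      // consumed by Presentation via Application
        return services;
    }
}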

Reports often query data across bounded contexts, might include some business logic, and have framework dependencies (a docx manipulation library); on the other hand, the queries must be performance-optimized (sometimes direct data access instead of going through repositories).

The tricky part is finding the right balance between separating architectural concerns and maintenance costs, since having reports scattered across 4 layers is not ideal either.


There are 2 answers

Answer by desertech

Generating reports (actual documents in your case) means you can start by defining a repository object for exactly that purpose:

// Domain
public interface IReportRepository
{
    string AddReport(IReportData data);
}

// Infrastructure
public class ReportRepository : IReportRepository
{
    public string AddReport(IReportData data)
    {
        // Use the docx library (framework dependency) to generate the report from 'data'
        // and return the path to the generated file (generation elided in this sketch).
        throw new System.NotImplementedException();
    }
}
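
For a more concrete picture, here is a minimal sketch of such an implementation, assuming the Open XML SDK (DocumentFormat.OpenXml) as the docx library and a hypothetical Title property on IReportData; neither is prescribed by the question:

// Infrastructure
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public class OpenXmlReportRepository : IReportRepository
{
    public string AddReport(IReportData data)
    {
        string filePath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".docx");

        using (var doc = WordprocessingDocument.Create(filePath, WordprocessingDocumentType.Document))
        {
            MainDocumentPart mainPart = doc.AddMainDocumentPart();
            mainPart.Document = new Document(
                new Body(
                    new Paragraph(
                        new Run(
                            new Text(data.Title))))); // hypothetical property on IReportData

            mainPart.Document.Save();
        }

        return filePath; // callers only see the path, not the docx dependency
    }
}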

Assuming the data for the report is aggregated by fetching multiple components from different bounded contexts, you may stick to the repository pattern and define repository objects for the different components:

// Domain
public interface IReportDataProvider
{
    object GetReportData();
}

// Infrastructure
public class ReportDataProvider1 : IReportDataProvider
{
    ...
}

// Infrastructure
public class ReportDataProvider2 : IReportDataProvider
{
    ...
}

Use-case implementation:

{
    ...
    object data1 = this.provider1.GetReportData();
    object data2 = this.provider2.GetReportData();
    IReportData data = this.Process(data1, data2);
    string filePath = this.reportRepository.AddReport(data);
    ...
}

IReportDataProvider can be implemented either as an API call or as direct data access for better performance. It really depends on how you perceive this reports API.

If the reports API functions as an orchestrator, you should have the providers call external APIs (in which case the external APIs apply their own logic, while the Process method applies only aggregation logic).

But if this API is more like a whole processor, then each provider may fetch data directly (in which case the Process method applies the full business logic: the logic for data1 and data2, plus the aggregation logic).
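
For illustration, a rough sketch of both flavours. The HTTP endpoint, the connection string handling and the SQL are assumptions for the example, not part of the question:

using System;
using System.Data.SqlClient;
using System.Net.Http;

// Orchestrator style: the provider calls another bounded context's API
// (hypothetical URL; that context applies its own logic).
public class SalesApiReportDataProvider : IReportDataProvider
{
    private readonly HttpClient http;
    public SalesApiReportDataProvider(HttpClient http) => this.http = http;

    public object GetReportData()
    {
        // Blocking call only because the interface in this sketch is synchronous.
        return this.http.GetStringAsync("https://sales.example/api/report-data")
                        .GetAwaiter().GetResult();
    }
}

// Processor style: the provider reads what it needs directly from the database,
// bypassing repositories for performance (hypothetical table and column names).
public class SalesDbReportDataProvider : IReportDataProvider
{
    private readonly string connectionString;
    public SalesDbReportDataProvider(string connectionString) => this.connectionString = connectionString;

    public object GetReportData()
    {
        using (var connection = new SqlConnection(this.connectionString))
        using (var command = new SqlCommand(
            "SELECT SUM(Total) FROM dbo.Orders WHERE OrderDate >= @from", connection))
        {
            command.Parameters.AddWithValue("@from", DateTime.UtcNow.AddMonths(-1));
            connection.Open();
            return command.ExecuteScalar(); // raw value; Process(...) applies the business logic
        }
    }
}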

Answer by René Link

A PDF is a presentation issue. There is no difference between rendering HTML that a browser displays and generating (rendering) a PDF that a PDF reader displays. So it is a view in the UI layer, and you can apply the same architecture patterns as you would with any other presentation technology.

In the simplest scenario you have a controller that is executed in order to create a report. The controller collects data from UI models, creates a request model and invokes the use case - or, more precisely, its input port. When the controller invokes the input port it passes an output port implementation, a presenter that can format the response model values into a human-readable form - a UI model. Once this is done, the UI model that contains the report data is passed to a view that can render or display it.
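
For illustration, a compressed sketch of that flow for a PDF report. All type and member names below are made up for the example, and the actual PDF rendering is hidden behind the view interface:

// Application layer: use case boundary.
public class ReportRequestModel { public int CustomerId { get; set; } }
public class ReportResponseModel { public decimal TotalRevenue { get; set; } }

public interface IReportInputPort
{
    void CreateReport(ReportRequestModel request, IReportOutputPort outputPort);
}

public interface IReportOutputPort
{
    void Present(ReportResponseModel response);
}

// UI layer: the presenter formats the response model into a UI model ...
public class ReportUiModel { public string TotalRevenueText { get; set; } }

public class PdfReportPresenter : IReportOutputPort
{
    public ReportUiModel UiModel { get; private set; }

    public void Present(ReportResponseModel response)
    {
        UiModel = new ReportUiModel
        {
            TotalRevenueText = response.TotalRevenue.ToString("C") // human-readable form
        };
    }
}

// ... and the "view" renders it, here as a PDF instead of HTML.
public interface IPdfReportView
{
    byte[] Render(ReportUiModel model);
}

// UI layer: the controller wires request model, use case, presenter and view together.
public class ReportController
{
    private readonly IReportInputPort useCase;
    private readonly IPdfReportView view;

    public ReportController(IReportInputPort useCase, IPdfReportView view)
    {
        this.useCase = useCase;
        this.view = view;
    }

    public byte[] DownloadReport(int customerId)
    {
        var presenter = new PdfReportPresenter();
        this.useCase.CreateReport(new ReportRequestModel { CustomerId = customerId }, presenter);
        return this.view.Render(presenter.UiModel);
    }
}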

Reports often query data across bounded contexts, might include some business logic, and have framework dependencies (a docx manipulation library); on the other hand, the queries must be performance-optimized (sometimes direct data access instead of going through repositories).

When you have to generate reports that span multiple bounded contexts, you usually have a separate report bounded context. In such situations I would let the other bounded contexts emit domain events that are processed by the report component. This means that the report component keeps copies or partial copies of domain information from the other contexts and persists them in a way that they can be accessed efficiently. Sure, if you do that you increase the storage that is used and you might have to deal with eventual consistency issues, but your report component keeps doing its job even when the other components are not available, e.g. in a microservice architecture. And since your report component keeps copies in a way that they can be accessed efficiently, you have full control over performance. It might be a separate database engine, or at least separate tables with the indexes you need for reports.
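
A rough sketch of that event-driven approach; the event shape, the read model and the store interface are assumptions for illustration only:

using System;

// Domain event emitted by another bounded context, e.g. Ordering.
public class OrderPlaced
{
    public Guid OrderId { get; set; }
    public Guid CustomerId { get; set; }
    public decimal Total { get; set; }
    public DateTime PlacedAt { get; set; }
}

// Read model kept by the report bounded context, shaped for reporting queries.
public class MonthlyRevenueRow
{
    public int Year { get; set; }
    public int Month { get; set; }
    public decimal Revenue { get; set; }
}

// Hypothetical store; in practice a separate database or table with report-friendly indexes.
public interface IReportReadModelStore
{
    MonthlyRevenueRow GetOrCreate(int year, int month);
    void Save(MonthlyRevenueRow row);
}

// The report component subscribes to the event and maintains its own
// (eventually consistent) copy of the data it needs.
public class OrderPlacedHandler
{
    private readonly IReportReadModelStore store;
    public OrderPlacedHandler(IReportReadModelStore store) => this.store = store;

    public void Handle(OrderPlaced @event)
    {
        var row = this.store.GetOrCreate(@event.PlacedAt.Year, @event.PlacedAt.Month);
        row.Revenue += @event.Total;
        this.store.Save(row);
    }
}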

You can also take the on-the-fly approach and collect all data from all components when the report is requested. In this case you do not duplicate data and you might not have to deal with eventual consistency issues. But each component that is queried has to collect and transform the data into the structure you need at execution time, so it cannot be as performant as if the data had been collected in a report-friendly way beforehand. All components must also be up and running at the time the report is requested. In a microservice architecture it can become cumbersome to ensure this, and it creates dependencies on the microservices.

The tricky part is finding the right balance between separating architectural concerns and maintenance costs

Yes, that's the job of software architects. We have to make a guess based on the company's conditions and goals, and adjust our architecture when the goals or conditions change. That's one reason why we should keep an eye on testability: we need to refactor from time to time.