Using dynamic schema with postgres

3.3k views Asked by At

I have to decide how I want to implement dynamic schema. I will try to put some more description where my problem is.

I use the postgres DB in latest version and app is written in Ruby and Hanami. I have requirement to implement view where I will display list of users in our application (table view). Here comes first requirement that it needs to contain dynamic data. For example user has name and birth_date and I store this data in Users table. In other table I store metrics related to user like number of views and comments. This is only part of data. I have couple more tables related to my User and I need to join them into one view. It doesn't sound very dynamic. I know. But I have another requirement. I want to allow users to add custom column to my table view.

At this point we have table with following predefined columns:

name, birth_date, number_of_views, number_of_comments 

and we can add some custom ones.

Now It becomes dynamic. It is not the end. I need to add filtering on this columns. So for example if given user has columns I mentioned earlier I should allow to filter on all of them (predefined and custom). I am trying to investigate how I should implement this feature.

Now the question part :) I am thinking about how I should store this data. I am thinking about storing it as a JSONB column in my postgres. Scale of this is about 5kk of users. So this mean 5kk rows in this table. Not so much. I am wondering about filtering this data using dynamic filters. I tried to make some benchmarks for 5kk table and it worked well if I used pagination (without indexes). For querying whole dataset it took like 10sec so wasn't perfect.

Questions

  1. Is JSONB good idea? I am not sure If I can define proper indexes if I will filter dynamically on this data (every query can be different). How to define indexes for this scenario?
  2. Is a good idea to store predefined columns as normal columns in table and only custom ones in JSONB? Will it give me any profit?
  3. I should I use postgres to that? I don't believe that postgres isn't good enough for this scale and I prefer to not change tool.

Thanks for all answers. I don't have huge experience with dynamic schemas so every tip can be usefull ;P

2

There are 2 answers

0
AudioBubble On

I would recommend using NoSQL database like mongoDB. Your problem statement requires schema-less document store supporting fast reads ie. MongoDB or Couchbase which has great support

1
waldir dias On
    -- DROP FUNCTION IF EXISTS padrao.func_create_schema_and_table(text);

CREATE OR REPLACE FUNCTION padrao.func_create_schema_and_table(
    sufixo_schema text)
    RETURNS boolean
    LANGUAGE 'plpgsql'
    COST 100
    VOLATILE PARALLEL UNSAFE
AS $BODY$
declare
    sql_cod text;
begin  
    --STRING QUE ARMAZENA O CÓDIGO QUE SERÁ EXECUTADO
    sql_cod = '
    
----------------------------------- CRIAÇÃO SCHEMA ----------------------------------------  
    
CREATE SCHEMA IF NOT EXISTS schema_'|| sufixo_schema ||' AUTHORIZATION pitjndjzwxxx;
    
---------------------------------- CRIAÇÃO DAS TABELAS ------------------------------------

CREATE TABLE IF NOT EXISTS schema_'|| sufixo_schema ||'.produtos (
    "pk_id_produto" serial NOT NULL,
    "fk_software_user" serial NOT NULL,
    "produto_codigo" varchar(10) NOT NULL UNIQUE,
    "produto_descricao" varchar(255) NOT NULL,
    "produto_atualizado" varchar(20) NOT NULL,
    "produto_situacao" varchar(20),
    "produto_dados" jsonb,
    PRIMARY KEY ("pk_id_produto"),
    UNIQUE (produto_codigo,fk_software_user)
); 
/*----------------------------------------------------------------------------------------------*/  
    ';
    
    --EM CASO DE ERRO SERÁ RETORNADO A MENSAGEM ABAIXO
    raise notice '### ERRO NA EXECUÇÃO DA FUNÇÃO POSTGRES ###';
    
    --EXECUÇÃO DO CÓDIGO QUE FOI SALVO NA VARIÁVEL:sql_cod 
    EXECUTE format(sql_cod, sufixo_schema);

    return true; 
          
END;
$BODY$;

ALTER FUNCTION padrao.func_create_schema_and_table(text)
    OWNER TO pitjndjzwqrxxx;