Creating Embedding Matrix for LSTM Model with BERT Feature Representations on Arabic Dataset


I'm working on implementing an LSTM model for an Arabic dataset using BERT feature representations. I've utilized the 'asafaya/bert-base-arabic' model for this purpose:

```
from transformers import AutoModelForMaskedLM

bert_model = AutoModelForMaskedLM.from_pretrained('asafaya/bert-base-arabic')
```

Now, I'm facing the challenge of creating an embedding_matrix to be used in the following statement:

```
model_LSTM.add(Embedding(vocab_length, embedding_vector_features,
                         weights=[embedding_matrix],
                         input_length=length_long_sentence))
```

Since BERT produces contextual embeddings, the feature representation of the same word varies with its context, whereas a Keras Embedding layer expects a single static vector per word.
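
For illustration, a minimal check along these lines (the sentence pair and helper function are purely illustrative, not part of my pipeline) shows the same word receiving different vectors in two different sentences:

```
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('asafaya/bert-base-arabic')
encoder = AutoModel.from_pretrained('asafaya/bert-base-arabic')

def token_vector(sentence, word):
    # Hidden state of the first subword piece of `word` inside `sentence`.
    inputs = tokenizer(sentence, return_tensors='pt')
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state[0]
    piece_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(word)[0])
    position = (inputs['input_ids'][0] == piece_id).nonzero()[0].item()
    return hidden[position]

# "المدرسة" (school) in two different contexts: the cosine similarity is
# below 1.0 because the vectors are context-dependent.
v1 = token_vector('ذهب الولد إلى المدرسة', 'المدرسة')
v2 = token_vector('المدرسة الفكرية الجديدة', 'المدرسة')
print(torch.cosine_similarity(v1, v2, dim=0))
```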

I would appreciate any guidance or suggestions on how to effectively create the embedding matrix for this scenario. Thank you!

I tried the following:

```
from transformers import AutoModelForMaskedLM

def bert_embedding_matrix():
    bert = AutoModelForMaskedLM.from_pretrained("asafaya/bert-base-arabic",
                                                output_hidden_states=True)
    # Drill down to the BertEmbeddings module and take its static
    # word-embedding table (one row per entry in BERT's vocabulary).
    bert_embeddings = list(bert.children())[0]
    bert_word_embeddings = list(bert_embeddings.children())[0]
    mat = bert_word_embeddings.word_embeddings.weight  # torch.Size([32000, 768])
    return mat

embedding_matrix = bert_embedding_matrix()
```

but I get the following error: `ValueError: Layer embedding_1 weight shape (8155, 300) is not compatible with provided weight shape torch.Size([32000, 768])`.
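
If I'm reading the error right, my Embedding layer was configured for my own vocabulary (vocab_length = 8155) with 300-dimensional vectors, while BERT's static table is 32000 × 768, so the shapes can't line up. What I'm considering is building a (vocab_length, 768) matrix by looking each of my words up in BERT's table. A rough sketch (not tested; `word_index` and `vocab_length` come from my own Keras tokenizer, and averaging subword pieces is just one possible choice):

```
import numpy as np
from transformers import AutoTokenizer

bert_tokenizer = AutoTokenizer.from_pretrained('asafaya/bert-base-arabic')

# BERT's static word-embedding table as a plain array: shape (32000, 768).
bert_table = embedding_matrix.detach().numpy()

# One row per word in *my* vocabulary; row 0 stays zero for padding.
embedding_vector_features = 768  # must match BERT's hidden size, not 300
embedding_matrix_lstm = np.zeros((vocab_length, embedding_vector_features))

for word, idx in word_index.items():  # word_index: my word -> index map
    if idx >= vocab_length:
        continue
    # Average the static embeddings of the word's subword pieces.
    piece_ids = bert_tokenizer.encode(word, add_special_tokens=False)
    if piece_ids:
        embedding_matrix_lstm[idx] = bert_table[piece_ids].mean(axis=0)
```

I would then pass embedding_matrix_lstm as the weights and set embedding_vector_features = 768 in the Embedding layer above. Is this kind of mapping (and the subword averaging) a reasonable way to use BERT representations with an LSTM, or is there a better approach?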
