
SEO Optimization through Transformer Models

Improve product descriptions with SEO techniques built on transformer models such as BERT and GPT-3 to enhance search visibility and clarity.

Last modified on March 20, 2026 • 8 min read



Transformer models, including BERT (Devlin et al., 2018) and GPT-3 (Floridi & Chiriatti, 2020), have made remarkable strides in Natural Language Processing (NLP), achieving near-human performance in tasks like text generation and paraphrasing. Although GPT-3 can produce product descriptions that account for subjective aspects like writing style, there are concerns about the quality and informativeness of these outputs.

This blog post examines how to optimize product descriptions so that they perform better in search for frequently searched keywords. Important SEO considerations include keyword density, readability, and total word count. Notably, Google’s search algorithm now incorporates Transformer models (Vaswani et al., 2017): BERT is used in about one out of ten searches to improve the understanding of webpage content.

When Google has difficulty comprehending a webpage, human readers are likely to struggle as well. This article outlines the key components of an SEO score and discusses how specific factors can be improved using an SEO-focused model.

Constructing the SEO Score

The proposed SEO score comprises seven sub-scores, each with its own weight. The overall score is the weighted average of these sub-scores. SEO tools such as Yoast and SEMrush use similar metrics.

Keyword Density

Keyword density reflects how frequently a keyword is mentioned in a text, with an optimal range of 1-2% (blog.alexa.com) to prevent keyword stuffing. Keywords may relate to product categories or brands, or be generated using the Google Ads API. The score is inversely proportional to keyword density.
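As a rough illustration of the density measure itself, the sketch below counts keyword hits as a percentage of all words. The function names `keyword_density` and `in_optimal_range` are hypothetical, the tokenization is deliberately simple, and only single-word keywords are handled; how the article maps density to a sub-score is not reproduced here.

```python
import re


def keyword_density(text: str, keywords: list[str]) -> float:
    """Return the percentage of words in `text` that match any keyword."""
    # Simple tokenization on word characters; an assumption, not the article's tokenizer.
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    targets = {k.lower() for k in keywords}
    hits = sum(1 for w in words if w in targets)
    return 100.0 * hits / len(words)


def in_optimal_range(density: float, low: float = 1.0, high: float = 2.0) -> bool:
    """Check a density (in percent) against the 1-2% range named in the text."""
    return low <= density <= high
```

For a 50-word text that mentions a keyword once, `keyword_density` returns 2.0, which sits at the upper edge of the recommended range.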

Query-Text Score

This metric evaluates how well a text aligns with Google’s ranking criteria. The top 10 keywords for a category or brand are extracted with the Google Keyword Planner, and the cosine similarity between these keywords and the text is then computed using Sentence-BERT (Reimers & Gurevych, 2019).
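The similarity step can be sketched in plain Python, assuming the embedding vectors have already been produced by a Sentence-BERT model. The function names are hypothetical, and averaging the per-keyword similarities is an assumption: the text does not say how the ten keyword similarities are combined.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def query_text_score(
    keyword_vectors: Sequence[Sequence[float]],
    text_vector: Sequence[float],
) -> float:
    """Mean cosine similarity between each keyword embedding and the text embedding.

    Averaging is an assumption made for this sketch.
    """
    if not keyword_vectors:
        return 0.0
    sims = [cosine_similarity(k, text_vector) for k in keyword_vectors]
    return sum(sims) / len(sims)
```

Keeping the embedding model outside these functions means the same scoring code could be reused with an English or a Dutch sentence encoder.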

Word Count

The recommended word count varies by content type; typically, blog entries contain more words than product descriptions. The key focus is ensuring Google can grasp the content, irrespective of its length. This score serves as a supplementary measure rather than a focal point.

Sentence Length

Sentence length affects readability: sentences under three words are treated as invalid, and excessively long sentences hinder comprehension. Ideally, no more than 25% of sentences should exceed 25 words (medium.com); higher proportions result in lower scores.
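A minimal sketch of the underlying measurement, assuming a naive punctuation-based sentence splitter and assuming that invalid sentences (under three words) are simply left out of the count. The names `split_sentences` and `long_sentence_share` are hypothetical, and the penalty applied above the 25% mark is not modeled.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def long_sentence_share(text: str, max_words: int = 25, min_words: int = 3) -> float:
    """Fraction of valid sentences that contain more than `max_words` words."""
    # Dropping sentences shorter than `min_words` is an assumption of this sketch.
    sentences = [s for s in split_sentences(text) if len(s.split()) >= min_words]
    if not sentences:
        return 0.0
    long_count = sum(1 for s in sentences if len(s.split()) > max_words)
    return long_count / len(sentences)
```

A text would meet the guideline when `long_sentence_share(text) <= 0.25`.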

Passive vs. Active Voice Score

Using the active voice improves readability. Although Google can understand both active and passive constructions (Warstadt & Bowman, 2019), active voice tends to be more straightforward for readers, which improves SEO scores (developers.google.com). Texts in which more than 10% of sentences are passive are penalized. For Dutch, passive sentences are classified by a specialized BERTje model (de Vries et al., 2019) (huggingface.co). For English, passive voice detection can be conducted using this code: github.com.

Use of Transition Words

Transition words improve readability and narrative flow. The score is based on the percentage of sentences that contain a transition word, with 30% as the optimal ratio.
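The percentage can be computed as sketched below. The word list `TRANSITION_WORDS` is a short hypothetical sample rather than the list used in the article, and multi-word transitions such as "in addition" are not handled.

```python
import re

# Hypothetical sample list; a real list would be longer and language-specific.
TRANSITION_WORDS = frozenset(
    {"however", "therefore", "moreover", "furthermore", "because", "also", "finally"}
)


def transition_share(sentences: list[str], transition_words=TRANSITION_WORDS) -> float:
    """Fraction of sentences that contain at least one transition word."""
    if not sentences:
        return 0.0

    def has_transition(sentence: str) -> bool:
        tokens = set(re.findall(r"\w+", sentence.lower()))
        return bool(tokens & transition_words)

    return sum(1 for s in sentences if has_transition(s)) / len(sentences)
```

The result is a fraction between 0 and 1, so the 30% target corresponds to a value of 0.3.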

Readability Score

The Flesch Reading Ease score measures readability on a scale from 0 to 100, with a preferred range of 60-80 for product descriptions. Scores outside this range are penalized.
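For reference, the sketch below applies the standard English-language Flesch Reading Ease formula to pre-computed counts. Counting syllables is left to the caller, the function names are hypothetical, and whether the article uses the same constants for Dutch texts is not stated.

```python
def flesch_reading_ease(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Standard Flesch Reading Ease formula for English text."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def in_preferred_range(score: float, low: float = 60.0, high: float = 80.0) -> bool:
    """Check a score against the 60-80 range preferred for product descriptions."""
    return low <= score <= high
```

The raw formula is not bounded, so extreme texts can fall outside the nominal 0 to 100 scale; long sentences and many-syllable words both push the score down.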

Data

To compute SEO scores and improve texts with the SEO model, three datasets were used. Two come from Squadra Machine Learning Company’s service, Powertext.ai, while the third comes from Promptcloud and contains English product descriptions from Victoria’s Secret. Together, the data includes 10,500 English texts and 718 Dutch texts.

Dataset	Description
Shoes	500 English and 500 Dutch product descriptions about shoes generated using Powertext.ai.
Washing machines	218 Dutch product descriptions about washing machines generated using Powertext.ai.
Victoria’s Secret	535,600 English product descriptions of underwear and swimwear from 9 websites, with 10,000 texts randomly selected for SEO scoring.

SEO Scores

SEO scores were computed for each dataset, showing minimum, average, and maximum scores. Keywords were predetermined for consistency.

Dataset	Keywords
Shoes (English)	shoe, shoes, walking
Shoes (Dutch)	schoen, schoenen, lopen
Washing machines	wasmachine, wassen, kleding
Victoria’s Secret	bra, thong, body, panty, sexy

The calculated SEO scores are as follows:

Dataset	Min	Mean	Max
Shoes (English)	0.520	0.692	0.820
Shoes (Dutch)	0.600	0.776	0.870
Washing machines	0.630	0.790	0.910
Victoria’s Secret	0.270	0.591	0.820

Average individual scores, excluding word count, are summarized below. Scores are presented as ‘score (weight)’:

Dataset	Keyword density (2)	Query-Text (3)	Sentence length (1)	Passive vs Active (2)	Transition words (2)	Readability (3)
Shoes (English)	0.694	0.251	1.000	0.723	0.820	0.821
Shoes (Dutch)	0.693	0.443	0.994	0.637	0.968	0.980
Washing machines	0.772	0.509	0.732	0.922	0.952	0.838
Victoria’s Secret	0.927	0.082	0.628	0.976	0.287	0.709
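To make the weighting concrete, the sketch below computes a weighted average over the six sub-scores listed in the table, using the Shoes (English) row as input. The dictionary keys and the function name `weighted_average` are hypothetical. Because the word count sub-score and its weight are not given here, the result is only a partial score and is not expected to match the reported overall mean.

```python
# Weights taken from the table header above; word count is omitted
# because its weight is not stated.
WEIGHTS = {
    "keyword_density": 2,
    "query_text": 3,
    "sentence_length": 1,
    "passive_active": 2,
    "transition_words": 2,
    "readability": 3,
}


def weighted_average(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of the sub-scores present in `scores`."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Average sub-scores for Shoes (English), copied from the table above.
shoes_en = {
    "keyword_density": 0.694,
    "query_text": 0.251,
    "sentence_length": 1.000,
    "passive_active": 0.723,
    "transition_words": 0.820,
    "readability": 0.821,
}

partial_score = weighted_average(shoes_en, WEIGHTS)
```

With these inputs `partial_score` comes out at roughly 0.668. The heavily weighted Query-Text sub-score pulls the result down more than any other factor.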

SEO Model

Rewriting a text to achieve a higher SEO score is not straightforward. One strategy is to generate texts with a GAN, using the SEO score as a loss function. Alternatively, a paraphrasing model can be used, with progress verified through the SEO score. Our approach emphasizes readability, focusing on the passive versus active voice, transition words, and readability scores.

We fine-tuned GPT-3 (beta.openai.com) on 100 input-output pairs to improve text clarity. A text counts as improved if both the combined score of these three measures and the overall SEO score rise, while the new text remains similar to the original. Similarity is assessed with Sentence-BERT. Below are examples from the Victoria’s Secret and washing machines datasets; each original text is followed by its rewritten version.

A naturally sexy unlined shape, with a lower cut that’s perfect for revealing necklines. Designed to disappear under curve-hugging styles. Lift & Lining Unlined; lace choices are unlined with a foam sling for light lift Underwire cups Straps & Hooks Adjustable straps can convert to crossback and snap into place for a secure hold Back closure Double row of hook and eye closures; Sizes 34DDD, 36DD-36DDD & 38D-38DDD have triple row of closures for a secure, comfortable fit 4 settings to ensure a perfect fit Details & Fabric Double-lined sides for the smoothest shape Imported nylon/spandex

A naturally sexy unlined shape, with a lower cut that’s perfect for revealing necklines. The bra is designed to disappear under curve-hugging styles, with lift and lining that’s unlined and has a foam sling for light lift. The bra has an underwire cup, adjustable straps, and hooks that can convert to crossback and snap into place for a secure hold. The back closure features double rows of hook and eye closures, while the sizes 34DDD, 36DD-36DDD & 38D-38DDD have triple rows of closures for a perfect fit.

A true wardrobe essential with soft, no-show cups and a comfortable fit! Includes removable straps. Most push Structured, underwire cups Straps included Hook-and-eye closure Imported polyamide/spandex

This true wardrobe essential has soft, no-show cups and a comfortable fit. It includes removable straps so you can wear it anywhere. The bra is made of breathable materials such as polyamide and spandex, making it versatile for any occasion!

De digital inverter motor zorgt voor een geluidsniveau van slechts 51 db, wat echt stil is. Ook het droogresultaat van de aeg wasmachine l7wb86gw is van topniveau dankzij de centrifuge die een maximum toerental van 1600 rpm rotaties per minuut kan bereiken. Met het wassen verkrijg je 51 db, bij het centrifugeren hoor je maximaal 76 db Zijn laadvermogen van 8 kg is ruim te noemen en wat je ook wilt wassen, met zijn 16 voorgeprogrammeerde wascycli ben jij in staat van wassen echt maatwerk te maken.

De digital inverter motor zorgt voor een geluidsniveau van slechts 51 db, wat echt stil is. Ook het droogresultaat van de aeg wasmachine l7wb86gw is van topniveau, dankzij de centrifuge die een maximum toerental van 1600 rpm kan bereiken. Met het wassen verkrijg je 51 db, bij het centrifugeren hoor je maximaal 76 db. Het laadvermogen van 8 kg is ruim te noemen en wat je ook wilt wassen, met zijn 16 voorgeprogrammeerde wascycli ben jij in staat van wassen echt maatwerk te maken.
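The acceptance rule described in this section can be sketched as a simple check. The class `TextScores`, the function `is_improved`, and the `min_similarity` threshold of 0.8 are all hypothetical; the article does not state how the three readability measures are combined (a plain sum is assumed here) or what similarity cut-off is used.

```python
from dataclasses import dataclass


@dataclass
class TextScores:
    """Sub-scores relevant to the acceptance rule, plus the overall SEO score."""
    passive_active: float
    transition_words: float
    readability: float
    overall_seo: float

    def readability_total(self) -> float:
        # Assumption: the three measures are combined by a plain sum.
        return self.passive_active + self.transition_words + self.readability


def is_improved(
    old: TextScores,
    new: TextScores,
    similarity: float,
    min_similarity: float = 0.8,  # hypothetical threshold
) -> bool:
    """Accept a rewrite only if both scores rise and the meaning stays close."""
    return (
        new.readability_total() > old.readability_total()
        and new.overall_seo > old.overall_seo
        and similarity >= min_similarity
    )
```

Here `similarity` would be the Sentence-BERT cosine similarity between the original and the rewritten text, so a rewrite that scores well but drifts from the original content is rejected.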

Conclusion

This blog post outlines the essential principles of SEO and methods for assessing texts against them. We demonstrated that the SEO model can improve text clarity and raise SEO scores. Nonetheless, the model does not guarantee consistent readability improvements because of GPT-3’s inherent unpredictability. Future work may involve integrating an Encoder-Decoder model to convert passive sentences into active constructions. While the model is still under development due to data constraints, the existing SEO scores already indicate which aspects of a text need improvement, and the SEO model has yielded promising results in text optimization.

References

Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

Floridi, L., & Chiriatti, M. (2020). GPT-3: Its nature, scope, limits, and consequences. Minds and Machines, 30(4), 681-694.

Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).

de Vries, W., van Cranenburgh, A., Bisazza, A., Caselli, T., van Noord, G., & Nissim, M. (2019). BERTje: A Dutch BERT model. arXiv preprint arXiv:1912.09582.

Warstadt, A., & Bowman, S. R. (2019). Linguistic analysis of pretrained sentence encoders with acceptability judgments. arXiv preprint arXiv:1901.03438.

Guus van de Mond
