Inspiration Content Engine powered by AI

tneogi
4 min read · Jan 27, 2020

I spent a few hours one Sunday evening outlining how AI and web data can be used to build a content engine focused on daily inspirational content, personalized for each user. This post is the result of that evening’s ramblings. The ideas here are presented as is and need further vetting and analysis.

Problem Statement

  1. How can we use AI for curating inspiration content across the web?
  2. How can we use AI for producing inspiration content from across different verticals?
  3. How can we use AI for the recommendation and personalization of storytelling?

Theoretical approach — Summary

Inspirational content is a combination of two factors.

Personal Factors
Different people react to and are inspired by different kinds of content, depending on:

  • Profession
  • Knowledge / Education background
  • Social-cultural background
  • Goals in life at that point in time
  • Emotional and personal status or stage in life
  • Time of day, year and surrounding social / cultural aspects

Point-in-Time Factors
Different kinds of content inspire us at different points in time. Some examples are:

  • A news byte at breakfast
  • Podcasts on the way to work
  • A statistic in the middle of the day
  • A video clip over evening coffee

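Expressed mathematically, let’s denote the personal factors as vectors P1, P2 … Pn and the point-in-time factors as vectors T1, T2 … Tn. Inspiration I is then

I = P(n) × T(n)

In words, inspiration is a cross product of the sum of all vectors describing the personal aspects of the user and the sum of all vectors describing the point in time at which the user is accessing content. "Cross product" is used loosely here: the idea is that every personal factor is paired with every point-in-time factor.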
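One way to make the formula above concrete is to treat each factor as a single weight and pair every personal factor with every point-in-time factor, which is an outer product. This is only a sketch under my own assumptions: the function name `inspiration_matrix`, the factor names, and the weights are all hypothetical, and multiplying the two weights is just one possible way to combine them.

```python
from typing import Dict


def inspiration_matrix(
    personal: Dict[str, float], temporal: Dict[str, float]
) -> Dict[str, Dict[str, float]]:
    """Pair every personal factor with every point-in-time factor.

    Each cell holds the product of the two weights (an outer product).
    """
    return {
        p_name: {t_name: p_weight * t_weight for t_name, t_weight in temporal.items()}
        for p_name, p_weight in personal.items()
    }


# Hypothetical weights between 0 and 1, chosen only for illustration.
personal = {"profession": 0.8, "goals": 0.5}
temporal = {"breakfast": 1.0, "commute": 0.5}

matrix = inspiration_matrix(personal, temporal)
for p_name, row in matrix.items():
    print(p_name, row)
```

Each row of the result corresponds to one personal factor and each column to one point-in-time factor, which is the shape the Inspiration Matrix section relies on.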

Feature Engineering

The most critical aspect of any AI project is identifying the exact parameters that impact the problem at hand. These parameters (vectors) are the features of the data, and the process of identifying them is known as feature engineering. By my estimate, feature engineering is roughly 80% of the work in any AI project.

For this project, feature engineering will be focused on finding all the vectors that impact a user’s content consumption.

This is going to be an iterative process: we will start with some assumptions, keep measuring results, and improve the feature set over time.
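As a rough illustration of what a first pass at these features could look like, here is a minimal one-hot encoding sketch. The vocabularies in `VOCAB` and the `encode` helper are hypothetical placeholders of mine; a real project would derive the categories from data and revise them as results are measured.

```python
from typing import Dict, List

# Hypothetical starting assumptions; expected to change with each iteration.
VOCAB: Dict[str, List[str]] = {
    "profession": ["engineer", "designer", "founder"],
    "time_of_day": ["morning", "midday", "evening"],
}


def encode(profile: Dict[str, str]) -> List[int]:
    """One-hot encode a user profile against the known vocabularies.

    Unknown or missing values simply leave every slot for that feature at 0.
    """
    vector: List[int] = []
    for feature, values in VOCAB.items():
        vector.extend(1 if profile.get(feature) == value else 0 for value in values)
    return vector


features = encode({"profession": "designer", "time_of_day": "evening"})
print(features)  # [0, 1, 0, 0, 0, 1]
```

Adding a new feature in a later iteration only means adding another entry to the vocabulary, which keeps the early assumptions cheap to revise.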

The Inspiration Matrix

The key challenge for an AI that can identify, produce and recommend inspirational content is to find the list of all factors in play for each dimension. This cross product can be stated as a simplified matrix structure like so:

A 2 Dimensional Representation of an Inspirational Matrix

Note that in reality, this is an n × n matrix and hard to represent in a simple 2-dimensional figure.

This n-dimensional grid can be coded into an AI layer that forms the basis of an Inspirational content engine. This engine can be used by an app, web, or other platform via an API, to provide content to users.
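To give a feel for how such a grid might be coded, here is a small sketch that stores a score per content format for each (personal factor, time slot) cell and nudges those scores with user feedback. Everything in it is an assumption of mine rather than a settled design: the `InspirationGrid` class, the format list, and the simple moving-average update are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Content formats taken from the point-in-time examples earlier in the post.
FORMATS = ["news byte", "podcast", "statistic", "video clip"]


class InspirationGrid:
    """A tiny lookup grid: (personal factor, time slot) -> score per format."""

    def __init__(self, learning_rate: float = 0.1) -> None:
        self.learning_rate = learning_rate
        self.scores: Dict[Tuple[str, str], Dict[str, float]] = defaultdict(
            lambda: {fmt: 0.0 for fmt in FORMATS}
        )

    def recommend(self, personal: str, time_slot: str) -> str:
        """Return the highest-scoring format; ties go to the earliest in FORMATS."""
        cell = self.scores[(personal, time_slot)]
        return max(FORMATS, key=lambda fmt: cell[fmt])

    def record_feedback(
        self, personal: str, time_slot: str, fmt: str, engaged: bool
    ) -> None:
        """Move the score toward 1 if the user engaged, toward 0 otherwise."""
        cell = self.scores[(personal, time_slot)]
        target = 1.0 if engaged else 0.0
        cell[fmt] += self.learning_rate * (target - cell[fmt])


grid = InspirationGrid()
grid.record_feedback("engineer", "commute", "podcast", engaged=True)
print(grid.recommend("engineer", "commute"))  # podcast
```

A real engine would have many more dimensions per cell and a learned model instead of a hand-rolled update, but the lookup-then-feedback loop is the part an app or API would interact with.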

Inspiration API — High-level architecture

An AI stack is only as good as the data it can access for training, and only as powerful as the API through which it is made available to the outside world. At a high level, the Inspiration API might look like so:

High Level Diagram of Inspiration AI Stack
  1. External content sources and crawlers: Since not all content can be produced in house, crawlers that curate content from the web will need to be built.
  2. ElasticSearch index: All the curated content will be stored in a single ES index for further text processing.
  3. NLP layer: To identify content attributes and map them to inspiration requirements, NLP will be used to extract entities, emotions, etc.
  4. Preference layer: To map a user’s preferences and motivation requirements, a preference layer will need to be built.
  5. REST API: The entire stack will be exposed as a REST API that can be used by the YS platform and app.
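The five layers above can be sketched end to end in a few dozen lines, with each layer replaced by a trivial stand-in. To be clear about what is assumed: the `crawl` function returns hard-coded sample documents instead of crawling, a plain dictionary stands in for the ElasticSearch index, keyword matching stands in for real NLP, and `handle_request` only returns the JSON a REST endpoint might send. All names and URLs are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class Document:
    url: str
    text: str
    tags: Set[str] = field(default_factory=set)


# Hypothetical keyword lists standing in for a real emotion/entity model.
EMOTION_KEYWORDS: Dict[str, Set[str]] = {
    "motivation": {"overcame", "persisted", "goal"},
    "curiosity": {"discovered", "why", "how"},
}


def crawl() -> List[Document]:
    """Layer 1 stand-in: returns sample documents instead of crawling the web."""
    return [
        Document("https://example.com/a", "She persisted and overcame every setback."),
        Document("https://example.com/b", "Scientists discovered how bees navigate."),
    ]


def tag(doc: Document) -> Document:
    """Layer 3 stand-in: tag a document by simple keyword matching."""
    words = {word.strip(".,!?").lower() for word in doc.text.split()}
    doc.tags = {name for name, keywords in EMOTION_KEYWORDS.items() if words & keywords}
    return doc


def build_index(docs: Iterable[Document]) -> Dict[str, List[Document]]:
    """Layer 2 stand-in: an in-memory tag -> documents index."""
    index: Dict[str, List[Document]] = {}
    for doc in docs:
        for name in doc.tags:
            index.setdefault(name, []).append(doc)
    return index


def handle_request(index: Dict[str, List[Document]], preferences: List[str]) -> str:
    """Layers 4 and 5 stand-in: match preferences and return a JSON payload."""
    urls = [doc.url for pref in preferences for doc in index.get(pref, [])]
    return json.dumps({"items": urls})


index = build_index(tag(doc) for doc in crawl())
print(handle_request(index, ["motivation"]))
```

The point of the sketch is only the data flow between layers: content comes in, gets tagged, gets indexed, and is served back filtered by what the user prefers.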

Conclusion

I firmly believe that more and more content production, delivery and distribution will happen via AI. With the arrival of models such as GPT-2 and BERT, we will see fully AI-powered content engines that need no human intervention. Digital media and content businesses will be forced to compete with and adopt AI-powered publishing engines, because the low cost of content production, combined with the volume and accuracy that AI can provide, will be impossible to beat.

This post is part of my long-standing personal research around this area. Hit me up on titash at laxmi dot ai if you want to know more about the work I am doing in this area.
