
Arab data bodies – Arab futurism meets Data Feminism

Mustapha Bouchaqour, CUNY New York City College of Technology

Week 1: Getting to know my team and the project’s goal.

I have joined Professor Laila to work on this project. The project is conceived as a story that reflects what has happened in Arab countries since the Arab Spring (uprising) started in 2011. This story takes place in 2011. It is an Arab futuristic world where the history of the 21st century is one where data and artificial intelligence have created “data bodies (DB).” A hundred years from now, individuality is created out of data. Human and non-human subjectivities can be born solely from data.

The idea, then, is to develop a game. This game uses real data from early 21st-century uprisings and social movements, activating the 2011 Arab Uprising as ground zero, to create human non-human agents called data bodies. This week's goal was to make sense of the data collected, get to know the team I am working with, and understand the blueprint we need to design as the foundation for developing the game.

Week 2: Analyzing data using NLP with a first basic design in Unity 3D

My group is still developing a blueprint that will serve as the foundation for the game. However, the final product that I am trying to deliver is centered on 2 concepts. The game challenges power. The data provided is categorized into emotional, experience, and historical data (Arab uprising 2011). I am currently working on the gap between analyzing the data and implementing the game in Unity 3D. I am in the process of analyzing data that was gathered between 2011 and 2013. I will be using natural language processing (NLP) and designing the basic animation needed for the first stage.

Week 3: Deep dive into data

The dataset is held in a MySQL database. The data is split across several tables, as follows:

  • random key
  • session Tweet
  • User
  • Tweet
  • Tweet Test
  • Tweet URL
  • URL
  • Tweet Hashtag
  • Hashtag
  • Language
  • Session
  • Source

Based on the UML, there are 3 independent tables: Language, Session, and Source. In the UML model they have no direct connection to the other tables. However, I believe there are some intersections among all tables in the database; the way the data was collected may lead to this view. The remaining tables have an interesting set of intersections. The Tweet table has around 6 connections; in other words, it is connected to 6 tables: random key, session Tweet, User, Tweet Test, Tweet Hashtag, and Tweet URL. Here are some fields related to the tweet table:

The ‘tweet’ table glues everything together. It has the following columns:

  • twitter_id # I believe this twitter_id is also valid for the Twitter API, but I never tested to see if it was
  • geo # the geo data is PHP serialized GeoJSON point data (I believe lon lat), use a PHP deserializer to read it
  • source
  • from_user_id
  • to_user_id
  • lang_code
  • created_at
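The geo column is described above as PHP-serialized GeoJSON point data. As a minimal sketch of how such a value could be read from Python, the parser below handles only a small subset of the PHP serialization format (null, boolean, integer, float, string, array). The sample string and its coordinates are hypothetical; the actual layout of the stored values has not been checked against the database, and the (lon, lat) order follows the unverified guess in the column note.

```python
def _parse(s, pos):
    """Parse one PHP-serialized value starting at s[pos]; return (value, next_pos)."""
    kind = s[pos]
    if kind == "N":                      # N;
        return None, pos + 2
    if kind in ("b", "i", "d"):          # b:1;  i:42;  d:3.5;
        end = s.index(";", pos)
        raw = s[pos + 2:end]
        if kind == "b":
            return raw == "1", end + 1
        return (int(raw) if kind == "i" else float(raw)), end + 1
    if kind == "s":                      # s:<length>:"<text>";
        colon = s.index(":", pos + 2)
        length = int(s[pos + 2:colon])   # PHP counts bytes; this sketch assumes ASCII text
        start = colon + 2                # skip the colon and the opening quote
        return s[start:start + length], start + length + 2
    if kind == "a":                      # a:<count>:{<key><value>...}
        colon = s.index(":", pos + 2)
        count = int(s[pos + 2:colon])
        pos = colon + 2                  # skip the colon and the opening brace
        result = {}
        for _ in range(count):
            key, pos = _parse(s, pos)
            value, pos = _parse(s, pos)
            result[key] = value
        return result, pos + 1           # skip the closing brace
    raise ValueError(f"unsupported type marker {kind!r} at position {pos}")


def php_unserialize(data):
    """Return the Python value for a PHP-serialized string (subset only)."""
    value, _ = _parse(data, 0)
    return value


def point_from_geo(data):
    """Return (first, second) coordinate of a serialized GeoJSON point; assumed (lon, lat)."""
    coords = php_unserialize(data)["coordinates"]
    return coords[0], coords[1]


# Hypothetical example value, not taken from the dataset.
sample = 'a:2:{s:4:"type";s:5:"Point";s:11:"coordinates";a:2:{i:0;d:36.3;i:1;d:33.5;}}'
lon, lat = point_from_geo(sample)
```

PHP arrays come back as dictionaries keyed by index, which is why the coordinates are read with `coords[0]` and `coords[1]` rather than list unpacking.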

The ‘user’ table has the following:

  • user_id
  • username
  • profile_image_url # many of these are now broken, but some can be fixed by just modifying the hostname to whatever Twitter is using now

The ‘hashtag’ table has the following:

  • hashtag_name
  • Definition # these definitions were curated by Laila directly
  • Related_Country
  • Started_Collecting
  • Stopped_Collecting
  • hashtag_id

The ‘url’ table has the following:

  • url_id
  • url

You can look up the info of a tweet’s author by INNER JOINing the tweet table with the user table on the from_user_id column of the tweet table.
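The join described above can be sketched as follows. This is only an illustration: it uses an in-memory SQLite database as a stand-in for the MySQL server, the rows are made up, and it assumes that from_user_id in the tweet table matches user_id in the user table.

```python
import sqlite3

# In-memory stand-in for the real MySQL database; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (user_id INTEGER PRIMARY KEY, username TEXT, profile_image_url TEXT);
CREATE TABLE tweet (
    twitter_id INTEGER PRIMARY KEY,
    from_user_id INTEGER,
    to_user_id INTEGER,
    lang_code TEXT,
    created_at TEXT
);
INSERT INTO user VALUES (1, 'example_user_a', NULL), (2, 'example_user_b', NULL);
INSERT INTO tweet VALUES
    (100, 1, 2, 'ar', '2011-08-01 10:00:00'),
    (101, 2, NULL, 'en', '2011-08-02 12:30:00');
""")

# Attach the author's user row to each tweet (assumes from_user_id = user.user_id).
rows = conn.execute("""
SELECT t.twitter_id, u.username, t.lang_code, t.created_at
FROM tweet AS t
INNER JOIN user AS u ON u.user_id = t.from_user_id
ORDER BY t.twitter_id
""").fetchall()

for row in rows:
    print(row)
```

Because this is an INNER JOIN, tweets whose author has no row in the user table would be dropped from the result.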

Because tweets and hashtags, and also tweets and URLs, have a many-to-many relationship, they are associated by INNER JOINing on these association tables:

  • tweetHashtag
  • tweetUrl
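As a rough sketch of the many-to-many lookup, the example below joins tweets to hashtags through the tweetHashtag association table. It again uses in-memory SQLite with made-up rows. The column names inside tweetHashtag (tweet_id and hashtag_id) are assumptions, since the association table's columns are not listed here.

```python
import sqlite3

# In-memory stand-in for the real MySQL database; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tweet (twitter_id INTEGER PRIMARY KEY, lang_code TEXT);
CREATE TABLE hashtag (hashtag_id INTEGER PRIMARY KEY, hashtag_name TEXT);
-- Assumed column names for the association table.
CREATE TABLE tweetHashtag (tweet_id INTEGER, hashtag_id INTEGER);
INSERT INTO tweet VALUES (100, 'ar'), (101, 'en');
INSERT INTO hashtag VALUES (1, 'Syria'), (2, 'Egypt');
INSERT INTO tweetHashtag VALUES (100, 1), (100, 2), (101, 1);
""")

# One output row per (tweet, hashtag) pair: a tweet with two hashtags appears twice.
hashtag_rows = conn.execute("""
SELECT t.twitter_id, h.hashtag_name
FROM tweet AS t
INNER JOIN tweetHashtag AS th ON th.tweet_id = t.twitter_id
INNER JOIN hashtag AS h ON h.hashtag_id = th.hashtag_id
ORDER BY t.twitter_id, h.hashtag_name
""").fetchall()

for row in hashtag_rows:
    print(row)
```

The same pattern would apply to tweetUrl and the url table, with url_id in place of hashtag_id.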

In addition, an NLP model was developed to analyze the data and prepare the pipeline needed for Unity 3D.

A simple UML model was built to check the relationships between the tables.



Week 4: Storytelling using dataset from R-Shief.

My team’s ultimate goal is to create a virtual reality experience that projects the story behind the data. This is a story set in the future that locates the 2011 Arab Uprisings as the birth of the digital activism we witnessed grow globally throughout the twenty-first century, from Tunis to Cairo to Occupy Wall Street, from 5M and 12M in Spain to the Umbrella Revolution in Hong Kong, and more. The player enters a public mass gathering brimming with the energy of social change and solidarity. The player has from sunrise to sunrise to interact with “data bodies.”

However, given the short time I have and the deadline for delivering a solid final product, my mentor, Professor Laila, guided me to work on the following:

1 – Develop a homology visualization using the tweet data from August 2011 for #Syria

2 – Distribute the tweet data over several characters so that we can see how the data translates into emotional motion, including but not limited to: Anger, Dance, Protest, Read, etc.

Week 5: Creating and visualizing network data with Gephi.

I got access to the R-Shief server and used the “tweet” table. First, the nodes file was created by extracting all the user IDs from the tweet table. We assigned each user ID a specific reference, or Id, and produced a nodes file containing “Id” and “label” columns. The edges file was created by checking the relationships between user IDs within the “tweet” table. The “tweet” table contains two fields that capture this relationship: “from_user_id” and “to_user_id”. The edges file also contains several other fields, including the language.
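The node and edge extraction described above can be sketched roughly as follows. The input rows are hypothetical stand-ins for (from_user_id, to_user_id, lang_code) values selected from the tweet table, and the helper names are invented for illustration. The column headers Id, Label, Source, and Target are the ones Gephi's spreadsheet import recognizes; Language is an extra edge attribute.

```python
import csv
import io

# Hypothetical rows standing in for: SELECT from_user_id, to_user_id, lang_code FROM tweet
tweets = [
    (11, 22, "ar"),
    (22, 11, "en"),
    (33, None, "ar"),  # a tweet not addressed to another user
    (11, 33, "ar"),
]


def build_nodes_and_edges(rows):
    """Assign each user a sequential Id and turn addressed tweets into edges."""
    ids = {}    # user id from the database -> node Id used in the Gephi files
    edges = []
    for from_user, to_user, lang in rows:
        for user in (from_user, to_user):
            if user is not None and user not in ids:
                ids[user] = len(ids) + 1
        if to_user is not None:
            edges.append((ids[from_user], ids[to_user], lang))
    nodes = [(node_id, str(user)) for user, node_id in ids.items()]
    return nodes, edges


def to_csv(header, rows):
    """Render a header and rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


nodes, edges = build_nodes_and_edges(tweets)
nodes_csv = to_csv(["Id", "Label"], nodes)
edges_csv = to_csv(["Source", "Target", "Language"], edges)
print(nodes_csv)
print(edges_csv)
```

In this sketch, tweets without a to_user_id still contribute a node but no edge, so isolated users remain visible in the graph.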

Note: The data used still meets the same criteria:

  • Tweet contains “Syria”
  • Date range: August 2011

An example of the network data looks like this:

  • Each circle represents a node, which is a user ID
  • Edges are the connections between nodes
  • Edge colors represent the language linked to the tweet

Sentiment analysis using the same data from the tweet table:


The last graph is much better, allowing us to actually see some dips and trends in sentiment over time. Now all that is left to do is to project these changes in sentiment onto the avatars we create in Unity 3D.
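One common way to make dips and trends in sentiment visible is to average per-tweet scores by day. The sketch below shows that aggregation only. The scores are hypothetical, the created_at format is an assumption, and it is not necessarily how the graphs above were produced.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (created_at, sentiment score) pairs; scores run from -1 (negative) to 1 (positive).
scored_tweets = [
    ("2011-08-01 09:00:00", 0.5),
    ("2011-08-01 18:00:00", -0.5),
    ("2011-08-02 10:00:00", -1.0),
    ("2011-08-03 08:00:00", 0.25),
    ("2011-08-03 20:00:00", 0.75),
]


def daily_mean_sentiment(rows):
    """Group scores by calendar day and return {ISO date: mean score}, in date order."""
    buckets = defaultdict(list)
    for created_at, score in rows:
        day = datetime.strptime(created_at, "%Y-%m-%d %H:%M:%S").date()
        buckets[day].append(score)
    return {
        day.isoformat(): sum(scores) / len(scores)
        for day, scores in sorted(buckets.items())
    }


daily = daily_mean_sentiment(scored_tweets)
for day, mean_score in daily.items():
    print(day, mean_score)
```

A per-day series like this is also a convenient input for driving avatar states, since each day maps to a single value.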

Week 6: Continuing the research paper and exploring ML-Agents in Unity 3D

This week, my work focused entirely on Unity. I found many resources on how to integrate ML models into Unity 3D. My goal is to distribute sentiment clusters over the characters I have. In addition, I worked on wrapping up the abstract for the research paper.

Week 7: Finished the abstract; continuing the research paper and ML-Agents in Unity 3D

I finished the research paper abstract along with the introduction. I am still figuring out how to implement ML-Agents in Unity 3D and am wrapping up the demo.

Started writing up the final presentation.

Week 8: Deadline for a great experience

Over these 8 weeks, I’ve learned a lot in this REU and worked outside of my comfort zone. This week, I focused on preparing the presentation and wrapping up the research paper.

Final Report
