erDiagram COUNTRY ||--|{ POLICY: country_id COUNTRY ||--|{ REQUEST: country_id COUNTRY ||--|{ REQUEST_DETAIL: country_id COUNTRY ||--|{ CYBER_ATTACK: country_id COUNTRY ||--|{ EDGELIST: country_id COUNTRY { string country_id PK float covariates } POLICY { string url PK string country_id FK datetime issue_time string title string text category document_type category issuing_entity category type_of_law category type_of_liability } REQUEST { string country_id PK datetime request_time PK category reason PK integer n_requests } REQUEST_DETAIL { string country_id PK datetime request_time PK category requestor PK integer n_requests integer n_items_requested integer n_removed_legal integer n_removed_policy integer n_no_action_taken } CYBER_ATTACK { string source1 PK string country_id FK string victim FK category type } EDGELIST { string country_id FK string interaction PK string interaction_strength }
6354proposal
Datebase Purpose
The motivation of this database is to create a database about internet censorship, cyber attacks and social network among the countries for future research usage.
This database enable researchers to answer the following questions: How does the censorship rules spread across countries that share stronger relations? Why some countries are considered greater threats even though they initiate less cyber attacks? How does recent AI technology adoption affect the type of censorship and cyber attacks?
Schema
Methods and Explanation
COUNTRY
The country df contains all country level covariates. The unit is country and the country id is the primary key
POLICY
The policy df contains all policy level information from Wilmap. The unit is a piece of policy, and their unique urls serve as primary key. Country id is the foreign key here. The df also contains policy wise information such as their text and labels that are already classified.
REQUEST and REQUEST_DETAIL
The request and request detail dfs are the takedown initiations submitted by the governments to the Google platform. The request df’s unit is country-date-reason, using (country_id, request_time, reason) as primary key. The values in it are country-date level frequencies for each reasons. As for the detailed request df, it uses (country_id, request_time, requestor) as primary key, further separating the requests by the countries’ requestor agencies.
CYBER ATTACK
The cyber attack df is merged from 2 different sourced dfs that records the cyber attacks. The unit is incident and the sources’ unique url will serve as a primary key. The initiator and victim countries also uses country id as values, served as foregin keys.
EDGE LIST
The edge list is a countries pairs social network data that records the countries interactions and their relations. The unit will be a dyad of two countries, but with a country column that storage the dyad relations of such country which also conveniently served as a primary key.
Database Interface
flowchart TB user[user] <--> network((country network)) subgraph frontend [frontend] events(events) --> network((country network)) end network((country network)) --> database[(Database)] subgraph backend [backend] database[(Database)] --> response(data response) end response(data response) --> events(events) response(data response) --> network((country network))