Difference between revisions of "Database"

From Seth's Wiki
Jump to navigation Jump to search
 
(22 intermediate revisions by the same user not shown)
Line 1: Line 1:
*Disease -> Multiple symptoms
+
=Database Basics=
*Symptom -> Multiple diseases
+
*Relational databases (e.g. MySQL) are probably the most common. Essentially they store data in a set of 2D tables (relations) that follow certain rules of normalization and can be linked to each other via relational algebra (mostly set theory type functions).
 +
*To address problems encountered when trying to interface a RDBMS with an object-oriented programming language, alternative non-relational database structures (e.g. XML, NoSQL, Hierarchical, Network) are available and may have advantages in certain situations.
 +
*Tables (Relations/Relvars) are made up of Rows (Tuples/Records) and Columns (Attributes/Fields) where the intersection at a certain cell is called a value.
 +
*Attributes contain values of the same data type (domain), e.g. integer, string, boolean.
 +
*Each Tuple should have a primary key (normally an arbitrary integer) that identifies a unique tuple. Natural keys (those based on attributes) are possible as well but less optimized.
 +
*Databases can be indexed to streamline queries.
  
=Simple Example=
+
=Useful Resources=
 +
*[https://en.wikipedia.org/wiki/Relation_(database) Relation]
 +
*[https://en.wikipedia.org/wiki/Database_normalization#Satisfying_EKNF Database Normalization w/ step by step instructions w/ examples]
 +
*[https://en.wikipedia.org/wiki/Associative_entity Associative Entity, for resolving many-to-many cardinality]
 +
*[https://en.wikipedia.org/wiki/Primary_key Primary Key]
 +
*[https://en.wikipedia.org/wiki/Database_index Database Indexing]
 +
*[https://en.wikipedia.org/wiki/Relational_database Relational Database, lengthy overview]
 +
*[https://rstsaygili.medium.com/many-to-many-relationships-in-relational-databases-af867547914f Resolving Many-To-Many Relationships]
 +
*[https://builtin.com/data-science/sql-vs-nosql SQL vs NoSQL]
 +
 
 +
=Working Prototype=
 
{| class="wikitable sortable"
 
{| class="wikitable sortable"
! Disease !! Chest Pain !! Precipitated by Stress !! Tenderness !! Reflux !! Radiation to shoulder !! Radiation to back !! Fever !! Leukocytosis !! Elevated troponin !! ST elevation !! HDUS !! Absent breath sounds
+
! Disease !! Risk Factors !! Associated Conditions !! Complications !! Symptoms !! Vitals !! Signs !! Labs !! Imaging !! Studies
 +
|-
 +
| Primary Key || LIST of epidemiologic risk factors (age, sex, race, family history) || problem LIST at time of presentation || LIST of possible complications (screening and anticipatory guidance) || LIST of symptoms (keep it broad: chest pain rather than substernal acute pressure-like chest pain with radiation) || LIST of vital abnormalities || LIST of pertinent positive/negative physical exam findings || LIST relevant lab abnormalities || LIST of useful imaging findings || LIST relevant echo/ECG/PFT/sleep/etc studies
 +
|}
 +
{| class="wikitable"
 +
|+ Base Table
 +
|-
 +
! Attribute !! Domain !! Cardinality
 +
|-
 +
| Disease (PK) || text || --
 +
|-
 +
| Risk factors || text || M2M
 +
|-
 +
| Associated Conditions || text || M2M
 
|-
 
|-
| Stable angina  || 1 || 1 || 0 || 0 || 0 || 0 || 0 || 0 || 0 || 0 || 0 || 0
+
| Complications || text || M2M
 
|-
 
|-
| NSTEMI || 1 || 0 || 0 || 1 || 0  || 0 || 0 || 1 || 0 || 0 || 0
+
| Symptoms || text || M2M
 
|-
 
|-
| STEMI || 1 || 0 || 0 || 1 || 0 || 0 || 0 || 1 || 1 || 0 || 0
+
| Vitals || text || M2M
 
|-
 
|-
| GERD || 1 || 0 || 1 || 0 || 0 || 0 || 0 || 0 || 0 || 0 || 0
+
| Signs || text || M2M
 
|-
 
|-
| Esophageal rupture || 1 || 0 || 0 || 1 || 1 || 0 || 0 || 0 || 0 || 1 || 0
+
| Labs || text || M2M
 +
|-
 +
| Imaging || text || M2M
 
|-
 
|-
| Pericarditis || 1 || 0 || 0 || 0 || 0 || 1 || 1 || 0 || 0 || 0 || 0
+
| Studies || text || M2M
 +
|}
 +
 
 +
Each primary key (disease) can be associated with multiple values in each attribute. For example, CAD is associated with multiple risk factors (age, male gender, smoking, etc) and multiple symptoms (angina, dyspnea, nausea, etc). At the same time, each of these risk factors and symptoms are associated with other diseases besides CAD. For example, smoking is a risk factor for CAD, lung cancer, PVD, COPD, head and neck cancer, etc. In order to handle this many-to-many cardinality of the data, multiple base tables should be created and an associative table (junction table) can then be created as needed. The base tables in the above example would be:
 +
{| class="wikitable"
 +
|+ Base Table 1
 
|-
 
|-
| Costochondritis || 1 || 1 || 0 || 0 || 0 || 0 || 0 || 0 || 0 || 0 || 0
+
| Disease (PK)
 
|-
 
|-
| Pneumothorax || 1 || 0 || 0 || 0 || 0 || 0 || 0 || 0 || 0 || 1 || 1
+
| Risk factors
 +
|}
 +
 
 +
{| class="wikitable"
 +
|+ Base Table 2
 
|-
 
|-
| Pneumonia || 1 || 0 || 0 || 0 || 0 || 1 || 1 || 0 || 0 || 0 || 0
+
| Risk factors (PK)
 
|-
 
|-
| Aortic dissection || 1 || 0 || 0 || 0 || 1 || 0 || 0 || 0 || 0 || 1 || 0
+
| Disease
 
|}
 
|}
 +
 +
Disease < Symptom < Disease

Latest revision as of 01:28, 14 February 2023

Database Basics

  • Relational databases (e.g. MySQL) are probably the most common. Essentially they store data in a set of 2D tables (relations) that follow certain rules of normalization and can be linked to each other via relational algebra (mostly set theory type functions).
  • To address problems encountered when trying to interface a RDBMS with an object-oriented programming language, alternative non-relational database structures (e.g. XML, NoSQL, Hierarchical, Network) are available and may have advantages in certain situations.
  • Tables (Relations/Relvars) are made up of Rows (Tuples/Records) and Columns (Attributes/Fields) where the intersection at a certain cell is called a value.
  • Attributes contain values of the same data type (domain), e.g. integer, string, boolean.
  • Each Tuple should have a primary key (normally an arbitrary integer) that identifies a unique tuple. Natural keys (those based on attributes) are possible as well but less optimized.
  • Databases can be indexed to streamline queries.

Useful Resources

Working Prototype

Disease Risk Factors Associated Conditions Complications Symptoms Vitals Signs Labs Imaging Studies
Primary Key LIST of epidemiologic risk factors (age, sex, race, family history) problem LIST at time of presentation LIST of possible complications (screening and anticipatory guidance) LIST of symptoms (keep it broad: chest pain rather than substernal acute pressure-like chest pain with radiation) LIST of vital abnormalities LIST of pertinent positive/negative physical exam findings LIST relevant lab abnormalities LIST of useful imaging findings LIST relevant echo/ECG/PFT/sleep/etc studies
Base Table
Attribute Domain Cardinality
Disease (PK) text --
Risk factors text M2M
Associated Conditions text M2M
Complications text M2M
Symptoms text M2M
Vitals text M2M
Signs text M2M
Labs text M2M
Imaging text M2M
Studies text M2M

Each primary key (disease) can be associated with multiple values in each attribute. For example, CAD is associated with multiple risk factors (age, male gender, smoking, etc) and multiple symptoms (angina, dyspnea, nausea, etc). At the same time, each of these risk factors and symptoms are associated with other diseases besides CAD. For example, smoking is a risk factor for CAD, lung cancer, PVD, COPD, head and neck cancer, etc. In order to handle this many-to-many cardinality of the data, multiple base tables should be created and an associative table (junction table) can then be created as needed. The base tables in the above example would be:

Base Table 1
Disease (PK)
Risk factors
Base Table 2
Risk factors (PK)
Disease

Disease < Symptom < Disease