# Data Science Projects

## Data Science Projects

### 1. Predicting Stock Performance with Machine Learning

*Project type: Group Project* (**University of Liverpool**)

Built using the **Python** language and its **packages**.

**Aim** of the project: Maximize shareholder returns by predicting the price of Bitcoin using machine learning and deep learning techniques.

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP56sHyIy1DH74Foti%2F-McP5h-K4rCv4po07Y-Q%2FMicrosoftTeams-image%20\(4\).png?alt=media\&token=5ccf1312-1c56-4b6f-9f7d-7d4a8f24e29f)

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP56sHyIy1DH74Foti%2F-McP5iGiBeOE8qT5bhsc%2FETH%20tRAIDING.png?alt=media\&token=e40a37dc-afbd-4615-97b0-2de11c758c38)

### 2. **Credit Card Fraud Prediction using Machine Learning and Artificial Intelligence**

#### *Currently working on this project*

## BIG-Biological Data Projects

### **1.** Rheumatoid Arthritis: Data Analysis using gwascat

> Using the **R** language

```r
# The GWAS analysis needs Bioconductor packages, so install BiocManager first
install.packages("BiocManager")
```

```r
BiocManager::install("ggbio")
BiocManager::install("gwascat")
BiocManager::install("Homo.sapiens")
```

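```r
library(gwascat)
objects("package:gwascat")   # list the functions and data sets exported by gwascat
data(ebicat38)               # GWAS catalog snapshot shipped with the package
topTraits(ebicat38)          # most frequently reported traits

# Keep the first 294 records annotated with rheumatoid arthritis
ra_hits <- subsetByTraits(ebicat38, tr = "Rheumatoid arthritis")[1:294]
ra_hits
df_main <- data.frame(ra_hits)

getwd()  # the CSV below is written to this working directory
write.csv(df_main, "Gwas_RA_ALL.csv", row.names = FALSE)
```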
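To make the trait summary and the subsetting step more concrete, here is a minimal base R sketch of the underlying idea: tabulate trait labels, sort them by frequency, and count the records for one trait instead of hard-coding the number. The `traits` vector is invented for illustration, and this is not gwascat's own implementation, only an approximation of what such a summary does.

```r
# Hypothetical trait labels standing in for the catalog's trait column
traits <- c("Rheumatoid arthritis", "Type 2 diabetes", "Rheumatoid arthritis",
            "Height", "Rheumatoid arthritis", "Height")

# Frequency of each trait, most common first (the idea behind a "top traits" summary)
trait_counts <- sort(table(traits), decreasing = TRUE)
trait_counts

# Number of records for one trait, computed rather than hard-coded
n_ra <- sum(traits == "Rheumatoid arthritis")
n_ra
```

Computing the count this way avoids an out-of-range index if the catalog snapshot changes between package versions.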

```r
# Basic Manhattan plot
gwtrunc <- ebicat38
requireNamespace("S4Vectors")
mcols <- S4Vectors::mcols

# Cap extreme -log10(p) values so they do not dominate the y-axis
mlpv <- mcols(ebicat38)$PVALUE_MLOG
mlpv <- ifelse(mlpv > 1000, 1000, mlpv)
S4Vectors::mcols(gwtrunc)$PVALUE_MLOG <- mlpv

# Switch to UCSC chromosome names ("chr1", ...) and keep chromosome 1 only
library(GenomeInfoDb)
seqlevelsStyle(gwtrunc) <- "UCSC"
gwlit <- gwtrunc[which(as.character(seqnames(gwtrunc)) %in% c("chr1"))]

# Apply a tighter cap of 550 for the chromosome 1 plot
library(ggbio)
mlpv <- mcols(gwlit)$PVALUE_MLOG
mlpv <- ifelse(mlpv > 550, 550, mlpv)
S4Vectors::mcols(gwlit)$PVALUE_MLOG <- mlpv
autoplot(gwlit, geom = "point", aes(y = PVALUE_MLOG), xlab = "chr1")
```

Running this code produces the plot shown below:

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP56sHyIy1DH74Foti%2F-McP5mEbnfulnuPB6qN8%2Fchr1%20\(1\).jpg?alt=media\&token=f328caf7-0aa8-4fc1-9bee-773611abd3e8)
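The capping of `PVALUE_MLOG` in the code above can be illustrated on a toy vector. The sketch below uses invented values and shows that `pmin()` gives the same result as the `ifelse()` form, including for missing values; it is an illustration only and does not touch the catalog data.

```r
# Hypothetical -log10(p) values, including one extreme value and one missing value
mlog_p <- c(3.2, 8.7, 1200, 45, NA)

# Cap at 1000 using the same ifelse() pattern as the plotting code
capped_ifelse <- ifelse(mlog_p > 1000, 1000, mlog_p)

# pmin() expresses the same cap more directly
capped_pmin <- pmin(mlog_p, 1000)

capped_pmin
identical(capped_ifelse, capped_pmin)
```

Capping keeps a handful of extremely significant associations from compressing every other point against the bottom of the plot.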

### **2. Plotting Biological Data on the World Map: Density-based Mapping**

```r
# Install the maps and ggplot2 packages first
install.packages(c("maps", "ggplot2"))
```

```r
library(maps)
library(ggplot2)
```

```r
# Load the world polygon coordinates and set up an empty base plot
world_map <- map_data("world")
p <- ggplot() + coord_fixed() +
  xlab("") + ylab("")
```

```r
# Add the map to the base plot
base_world_messy <- p + geom_polygon(data = world_map, aes(x = long, y = lat, group = group),
                                     colour = "darkslategrey", fill = "white")

base_world_messy
```

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP56sHyIy1DH74Foti%2F-McP5f4m7W9a97f3ysjw%2FMAPO.png?alt=media\&token=aff1b10e-14d4-4dc7-863a-6c3ef7a8d8a0)

```r
#Strip the map down so it looks super clean (and beautiful!)
cleanup <- 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), 
        panel.background = element_rect(fill = 'darkslategrey', colour = 'darkslategrey'), 
        axis.line = element_line(colour = "white"), legend.position="none",
        axis.ticks=element_blank(), axis.text.x=element_blank(),
        axis.text.y=element_blank())

base_world <- base_world_messy + cleanup

base_world
```

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP56sHyIy1DH74Foti%2F-McP5du0B00Lh7fkx352%2FMAPOO.png?alt=media\&token=3b563f24-359f-420b-8a7c-908d635460dc)

```r
# Read the file with coordinate data (columns used below: long, lat, value)
df <- read.csv("C:/Users/workc/OneDrive/Desktop/RA_map_cord.csv")
```
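The plotting code that follows expects the columns `long`, `lat` and `value`. As a hedged sketch, the helper below checks a data frame for those columns and drops rows with missing or impossible coordinates. `check_map_points` and the toy values are hypothetical and are not part of the original project.

```r
# Hypothetical helper: checks that a data frame has the columns the map code uses
check_map_points <- function(points) {
  required <- c("long", "lat", "value")
  missing_cols <- setdiff(required, names(points))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  # Keep only rows whose coordinates fall inside valid longitude/latitude ranges
  valid <- !is.na(points$long) & !is.na(points$lat) &
    abs(points$long) <= 180 & abs(points$lat) <= 90
  points[valid, , drop = FALSE]
}

# Toy input with invented values; the third row has an impossible latitude
toy_points <- data.frame(
  long  = c(-0.13, 77.21, 10.00),
  lat   = c(51.51, 28.61, 123.00),
  value = c(12, 30, 5)
)

clean_points <- check_map_points(toy_points)
nrow(clean_points)
```

Running a check like this right after `read.csv()` surfaces a misnamed column before it shows up as a confusing ggplot2 error.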

```r
# Overlay the data points, scaling each point's size by its value
map_data_sized <-
  base_world +
  geom_point(data = df,
             aes(x = long, y = lat, size = value), colour = "black",
             fill = "deeppink", pch = 21, alpha = 1)

map_data_sized
```

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP56sHyIy1DH74Foti%2F-McP5aLtGFaeXBgz7R0l%2FMAPOOO.png?alt=media\&token=93f47137-0dea-4cc3-91af-37dc66e9f2ad)

Disease mapping on the world map by country

(An **improved** version of the **map above**)

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP3pEpV5z4dEbkbxRt%2F-McP4QHjdzz9cR3nI_mC%2FWmap_RA_GB.png?alt=media\&token=2fc43e9a-75e2-4c94-8dbe-c2c2801e5c14)
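The code behind the country-level map is not shown here. One possible approach, not necessarily the one used for the image above, is to join a per-country value onto the world polygons and fill each country by that value. In the sketch below, `country_values` and `join_country_values` are hypothetical, the values are invented, and the plotting step only runs if ggplot2 and maps are installed.

```r
# Invented per-country values; "region" must match the names used by map_data("world")
country_values <- data.frame(
  region = c("UK", "India", "Brazil"),
  value  = c(12, 30, 7)
)

# Attach each country's value to every polygon vertex of that country.
# merge() does not preserve row order, so the drawing order is restored afterwards.
join_country_values <- function(map_df, values) {
  joined <- merge(map_df, values, by = "region", all.x = TRUE)
  joined[order(joined$group, joined$order), ]
}

if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("maps", quietly = TRUE)) {
  library(ggplot2)
  world_map <- map_data("world")
  world_filled <- join_country_values(world_map, country_values)
  country_map <- ggplot(world_filled,
                        aes(x = long, y = lat, group = group, fill = value)) +
    geom_polygon(colour = "darkslategrey") +
    coord_fixed()
  print(country_map)
}
```

Countries without a value keep `NA` in the `value` column and are drawn in ggplot2's default colour for missing values.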

## Data Visualisation Gallery

#### **Data visualisation outputs are included below.**

> Using **Power-BI**

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP56sHyIy1DH74Foti%2F-McP5lAn-Mo5TwHvw46o%2FCVD_DIST.svg?alt=media\&token=9c5d476a-e2ff-4746-b0a3-0075b74e97e8)

> Using **Power-BI**

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP3pEpV5z4dEbkbxRt%2F-McP4VU-vR4jTNCxu6yJ%2Fg11626.png?alt=media\&token=6d070c1a-0ea6-4cbf-a8a9-b1ee4280f786)

### 1. Protein Database Analysis

Generated visualisations are given below:

> Created using the **R programming** language

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP3pEpV5z4dEbkbxRt%2F-McP4JUjlkiwnDuCzA03%2FFUll_Final.png?alt=media\&token=cad5aad5-a7ce-483d-b002-31f1f01c2120)

> Created using the **R programming** language

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP3pEpV5z4dEbkbxRt%2F-McP4SJWbs2DOG5wbYPY%2FHD_main_Final.png?alt=media\&token=6e62392f-09e3-46fe-b23c-b9275de47b50)

> Created using **R programming** language

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP3pEpV5z4dEbkbxRt%2F-McP4QHjdzz9cR3nI_mC%2FWmap_RA_GB.png?alt=media\&token=2fc43e9a-75e2-4c94-8dbe-c2c2801e5c14)

> Created using **Power-BI**

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP56sHyIy1DH74Foti%2F-McP5p9Z8tXBd8cL4zmC%2F100_400-1.jpg?alt=media\&token=63127da1-af40-44c3-8b86-8231778ab4d3)

> Created using **Power-BI**

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP56sHyIy1DH74Foti%2F-McP5niSb5_bCI8015ae%2F400_1020-1.jpg?alt=media\&token=2b2544e1-eedb-4430-8fff-dddb3ab1bd19)

> Created using **Power-BI**

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP5rvzkdQYHeC8uAPT%2F-McP75FCWjno2uwBSb8p%2F10_30-1.jpg?alt=media\&token=3ef74ca9-3e33-4a0e-891b-f4bd3a507091)

> Created using **Power-BI**

![](https://2672152490-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-McP3ju8T9EY2fcYeMRX%2F-McP56sHyIy1DH74Foti%2F-McP5qRx8VcO0nm1UJrz%2F1_10-1.jpg?alt=media\&token=48a65933-bc39-4a63-84b4-15499a83841b)
