class: center, middle, inverse, title-slide # Welcome to Data Science
🙌 ### Dr. Çetinkaya-Rundel --- layout: true <div class="my-footer"> <span> Dr. Mine Çetinkaya-Rundel - <a href="https://introds.org" target="_blank">introds.org </a> </span> </div> --- class: center, middle # Hello world! --- ## What is data science? -- - Data science is an exciting discipline that allows you to turn raw data into understanding, insight, and knowledge. - We're going to learn to do this in a `tidy` way -- more on that later! --- # What is this course? This is a course on introduction to data science, with an emphasis on statistical thinking. -- **Q - What data science background does this course assume?** A - None. -- **Q - Is this an intro stat course?** A - While statistics `\(\ne\)` data science, they are very closely related and have tremendous of overlap. Hence, this course is a great way to get started with statistics. However this course is **not** your typical high school statistics course. -- **Q - Will we be doing computing?** A - Yes. --- # What is this course? **Q - Is this an intro CS course?** A - No, but many themes are shared. -- **Q - What computing language will we learn?** A - R. -- **Q: Why not language X?** A: We can discuss that over ☕. --- ## Where is this course? <br><br><br><br> .large[ .center[ [introds.org](https://introds.org) ] ] --- ## Who am I? .pull-left[ <img src="img/mine.png" width="70%" style="display: block; margin: auto;" /> ] .pull-right[ .midi[ - **Office:** 2257 JCMB - **Student hours:** Tuesdays 14:30 - 16:30 and by appointment - **Email:** mine.cetinkaya-rundel@ed.ac.uk *(but post course questions on Piazza!)* - Where else can you find me? - **Twitter:** [@minebocek](https://twitter.com/minebocek) - **GitHub:** [mine-cetinkaya-rundel](https://github.com/mine-cetinkaya-rundel) - **Blog:** [citizen-statistician.org](http://citizen-statistician.org) ] ] --- class: center, middle # Data in the wild --- # The US of Bey <img src="img/brooke-watson-us-of-bey.gif" width="65%" style="display: block; margin: auto;" /> Brooke Watson, [blog.brooke.science/posts/the-us-of-bey](https://blog.brooke.science/posts/the-us-of-bey/) --- # Punctuation in literature <img src="img/julia-silge-punctuation.png" width="48%" style="display: block; margin: auto;" /> Julia Silge, [juliasilge.com/blog/punctution-literature](https://juliasilge.com/blog/punctution-literature/) --- # Text analysis of Trump's tweets <img src="img/david-robinson-trump-tweets.png" width="63%" style="display: block; margin: auto;" /> David Robinson, [varianceexplained.org/r/trump-tweets](http://varianceexplained.org/r/trump-tweets) --- ## Greatest Twitter scheme of all time <img src="img/bohemian-rhapsody.png" width="43%" style="display: block; margin: auto;" /> .small[ [gist.github.com/mine-cetinkaya-rundel/03d7516dea1e5f2613a5d71c28edb08d](https://gist.github.com/mine-cetinkaya-rundel/03d7516dea1e5f2613a5d71c28edb08d) ] --- ## Voting patterns in the UN <img src="img/unvotes.png" width="100%" /> [minecr.shinyapps.io/unvotes](https://minecr.shinyapps.io/unvotes/) --- class: center, middle # Your turn! --- ## Launch RStudio .instructions[ Go to [introds.org](https://introds.org), and launch RStudio. ] - When you click on RStudio this will first take you to Learn so that you can log in. - Then, in Learn, click on the RStudio link, which will try to open a new window in your browser, say yes. - In the next window, select RStudio under "Start Personal Notebook", and hit Start. - In the next window, on the top right, click on New, and choose RStudio again. And voila! --- ## Get code from GitHub .instructions[ Go to [introds.org](https://introds.org), grab the code you need from GitHub, and bring it in to RStudio. ] - From [introds.org](https://introds.org) navigate to the course GitHub organization. - Click on the repository called [`ex-01-unvotes`](https://github.com/ids-s1-19/ex-01-unvotes). - Click on the green button on the center right that says "Clone or download" and copy the repository URL by clicking on the clipboard icon. Make sure close with HTTPS is selected (always). - Go back to RStudio, click on File -> New Project -> Version Control -> Git: - Repository URL: Paste repo url here - Click tab and project name will be filled in for you - Create Project --- ## Create your first data visualization .instructions[ Modify the visualization code for a country of your own choice, and discuss your findings with your neighbor. ] - In RStudio, locate the Files pane in the bottom right corner. - In the Files pane, spot the file called `ex-01-unvotes.Rmd`. Open it, and then click on the "Knit" button. (It might ask you to install some packages, say yes.) - Go back to the file and change your name on top (in the `yaml` -- we'll talk about what this means later) and knit again. - Then change the country names to those you're interested in. Your spelling and capitalization should match how the countries appear in the data, so take a peek at the Appendix to see how the country names are spelled. Knit again. And voila, your first data visualization! --- ## Recap - A quick overview of R vs. RStudio and git vs. GitHub - A peek at R code for creating a complex data visualization - Course technology kit - Hands on practice with creating data visualizations with R --- ## Homework (due tomorrow) Should take <15 mins! - Read the course syllabus and tips for getting help at [introds.org](https://introds.org) - Sign up for Piazza (linked from the help page on the course website) --- ## Course overview <br><br><br><br> .large[ .center[ [introds.org](https://introds.org) ] ]