SQL for Data Science
Querying, Designing, and Deploying Databases
Welcome
Welcome to SQL for Data Science, the free online textbook that aims to help students, researchers, and professional data scientists understand the world of SQL and relational databases!
In this book, you will learn how to explore and interact with databases, extract insights from them, and build your own databases for storing and sharing knowledge.
This text is aimed at beginning learners who want to master the basics of SQL and database programming. You may be a student whose class is using this book as a textbook, a data analyst who needs to use SQL for your job, or an experienced professional who wants to master new skills and broaden your horizons. In any case, you should find this book to be an accessible (and free!) way to meet your learning objectives.
Acknowledgments
Significant portions of this textbook were written with agentic AI assistance, specifically Anthropic’s Claude Code platform. This obviously helps tremendously with efficient content generation, but a bigger reason I chose this path is how it streamlines some of the more tedious parts of writing a technical textbook. For example, the entire book centers on a Docker image I created that has all of the databases we use pre-installed, and agentic AI was able to automatically run the Docker image and execute SQL queries against these databases to validate all the code in the book. This helps ensure consistency and accuracy in a matter of minutes, and at a level that would take many hours for a human reviewer to match.
That being said, this book has undergone thorough and painstaking human review by me and my team. On that note, I want to give a profound THANK YOU to my technical reviewers, Taehee Kim and Patrick Grimes, two of my Data Science advisees here at Davidson. I have been incredibly fortunate to have their help on this undertaking which, without their acumen and insight, would have been monumentally more difficult.
I also need to thank the Duke Endowment, whose generosity funds my work and creates opportunities for my students to gain meaningful experience in data science and data engineering.
Lastly, I must thank my wife and partner, Megan. I was already an established data engineering professional when I answered Davidson’s call to teach, and becoming a professor was a major change in the nature of my work and the stress levels it generates. The first few semesters were especially brutal, and she was very skeptical of my idea to write a textbook because of the amount of work it would require of me. I’ve been able to pull this off in large part thanks to her steadfast love and patience. Indeed, she is my better half.
