Course Summary

Course Description

Data science and information management are quickly becoming key enablers of economic success for organizations that harness them. Students choosing the DSI concentration will be able to understand, design and exploit the analytic infrastructure and technologies required to deal with massive amounts of data.

Career Possibilities

  • Corporate Data Manager
  • Product Designer
  • Analytics Officer
  • Data Scientist
  • Consumer Demand Forecaster
  • Privacy Officer
  • Editor at Scientific Publication

Major Foundation Requirements

CS300 / Solving Problems with Algorithms

Learn how to design and analyze algorithms used to address complex problems. Solve problems ranging from logistics to route optimization to robotic arm control using algorithms such as hashing, searching, sorting, graph algorithms, dynamic programming, greedy algorithms, divide and conquer, backtracking, random number generation, and randomized algorithms.

CS301 / Bayesian Statistics in Practice

Bayes’ Theorem is a framework for combining prior information with new information to compute a posterior probability distribution. Gain insights into the differences between frequency-based statistics and Bayesian statistics by examining practical cases borrowed from such diverse fields as criminal justice, computer security and epidemiology. Also learn how to apply Bayesian approaches to learning from data.
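As a rough illustration of the kind of calculation this course covers, the sketch below applies Bayes' Theorem to a screening test, in the spirit of the epidemiology cases mentioned above. The function name `posterior` and all of the numbers (a 1% prior, a 95% detection rate, a 5% false positive rate) are hypothetical values chosen for the example, not course material.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(condition | positive test) using Bayes' Theorem.

    prior               -- P(condition) before seeing the test result
    likelihood          -- P(positive | condition)
    false_positive_rate -- P(positive | no condition)
    All values used below are illustrative assumptions.
    """
    # Total probability of a positive result, with or without the condition.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical screening scenario: rare condition, fairly accurate test.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))  # 0.161
```

Even with a fairly accurate test, the posterior stays low because the prior is low, which is the kind of result that separates Bayesian reasoning from intuition.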

CS302 / Decision Science

Apply formal models of decision making to practical problems involving uncertainty, competition, complex systems, risk aversion, decision biases, and multiple objectives. Follow one principal case study throughout the course — for example, the decisions involved in pursuing the development of a drug. The case study incorporates decision trees, risk-adjusted calculations, optimization, and other methods.

Concentration Core Requirements

CS310 / Machine Learning for Science and Profit

Learn to apply core machine learning techniques (such as classification, perceptrons, neural networks, support vector machines, hidden Markov models, and nonparametric clustering models) as well as fundamental concepts such as feature selection, cross-validation and overfitting. Program machine learning algorithms to make sense of genetic data, perform customer segmentation or predict the outcome of elections.

CS411 / Building Useful and Usable Database Systems

Use data models, data description languages, query methods such as relational algebra and SQL, data normalization, and transaction and security protocols to design an efficient and secure database system for a real-world example. Also explore new trends in databases to sketch out what databases of the future may look like. For example, students examine how graph databases can be leveraged to process social network data or discover relationships between entities, with applications to biological and health care databases.

CS530 / Information Visualization

Examine how to use a portfolio of data and information visualization tools and methods (graphs, dimension reduction methods, interactive visualization, network representations, and text visualization) for data exploration, visual analytics and information communication. Create a unique set of visualizations based on a multifaceted data set — for example, hospital performance, GDP per capita over time, city size and politics, or spread of infectious disease — and use the visualizations to gain insights from the data and communicate a compelling story.

Concentration Electives

CS313 / Applications of Text Mining and Computational Linguistics

Investigate how to apply text mining and classification approaches (tokenization, parts-of-speech tagging, stemming, computational semantics, lexical semantics, Bayes networks, latent semantic indexing, clustering, and support vector machines) to problems such as spam filtering, social media monitoring and text summarization.

CS331 / Time Series, Signal Processing and Image Processing

In the first part of this course, become familiar with the use of signal representation and processing methods (sampling, quantization, Fourier analysis, discrete Fourier transform, time-frequency analysis, and autoregressive modeling). Example applications are drawn from digital sound processing, consumer demand forecasting or brain signal analysis. In the second part, explore how to process digital images as two-dimensional signals in biomedical imaging and satellite imagery for agriculture.

CS431 / Information Management and Policy

Learn to recognize and address issues of privacy, security, equity, and intellectual property in information systems through the design and implementation of sound information management policies. Students design, and plan the deployment of, a set of information management policies for a multinational organization that has access to its users’ search data.

CS432 / Geospatial Data Analysis

Explore how to apply the mathematical and computational techniques specific to spatial and geographic data analysis. Use a geographic information system (GIS) to create a geolocation-based application, and justify why geolocation is key to the success of the application. Also create a map-based story on a topic such as migration patterns, climate change or political landscapes.

CS433 / Advanced Decision Science

Use advanced decision techniques such as real options and Monte Carlo simulation to address complex issues. Examples include project portfolio management, pharmaceutical drug development, and oil and gas investment decisions. Learn about philanthropic portfolio decisions requiring high-stake tradeoffs in highly uncertain environments and with complex, culturally and socially sensitive objectives.

CS531 / Network Science

Understand how to use network concepts (from graph theory, mathematics, probability theory, and statistical physics) to analyze and predict the behavior of social, economic and transportation networks. Use properties such as centrality, diameter and effective distance to, for example, model and predict the spread of an infectious disease and design interventions that can slow down or stop the epidemic.

CS532 / Real-Time Data Analysis

Harness the technologies that enable the scalable management of vast quantities of data collected through real-time and near-real-time sensing in the physical, biological and social domains. Explore alternatives to traditional databases to address the specific issues of real-time data processing and analytics, and apply this knowledge to cyber-security (network data), sentiment analysis (social media data) or traffic prediction (traffic data).
