I will discuss how data sets can be modeled as weighted graphs and studied through diffusions or random walks on it. These diffusions, and in particular the associated heat kernels, can be used to study certain geometric properties of the graph, as well as finding bi-Lipschitz parametrizations (mapping geodesic distances on the set to Euclidean distances in parameter space) on large chunks of the data, at least when the data is "manifold-like". This is useful for dimensionality reduction, visualization and interactive learning. Moreover, and most importantly, from the diffusion operators one can construct dictionaries of functions that are adapted to the data, generating Fourier-like basis functions (eigenfunctions of a Laplacian) or wavelet-like functions (called diffusion wavelets). These dictionaries can be used to analyze and approximate functions on graphs, and to perform learning tasks, or signal processing tasks such as compression and denoising. We present several examples, from semisupervised learning to image denoising and document organization.