We consider the intersection of two research fields: transfer learning and statistics on manifolds. In particular, we consider, for manifold-valued data, transfer learning of tangent-space models such as Gaussian distributions, PCA, regression, or classifiers. Though one would hope to simply use ordinary R^n transfer-learning ideas, the manifold structure prevents it. We overcome this by basing our method on inner-product-preserving parallel transport, a well-known tool widely used in other problems of statistics on manifolds and in computer vision. At first, this straightforward idea seems to suffer from an obvious shortcoming: transporting large datasets is prohibitively expensive, hindering scalability. Fortunately, with our approach, we never transport data. Rather, we show how the statistical models themselves can be transported, and prove that for the tangent-space models above, the transport 'commutes' with learning. Consequently, our compact framework, applicable to a large class of manifolds, is not restricted by the size of either the training or test sets. We demonstrate the approach by transferring PCA and logistic-regression models of real-world data involving 3D shapes and image descriptors.
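As a concrete illustration of the model-transport idea (a minimal sketch under our own assumptions, not the paper's implementation): on the unit sphere, parallel transport along the minimizing geodesic from p to q is a linear isometry between tangent spaces, so a tangent-space covariance Sigma fitted at p can be moved to q as G Sigma G^T, with no per-sample transport. The closed-form sphere transport below is standard; the function and variable names are ours.

```python
import numpy as np

def sphere_transport(p, q):
    """Matrix of parallel transport along the minimizing geodesic
    from p to q on the unit sphere (p, q unit vectors, p != -q).
    Restricted to the tangent space at p, it is an isometry onto
    the tangent space at q."""
    return np.eye(len(p)) - np.outer(p + q, q) / (1.0 + p @ q)

p = np.array([0.0, 0.0, 1.0])   # source base point
q = np.array([1.0, 0.0, 0.0])   # target base point
G = sphere_transport(p, q)

# Tangent vectors at p (orthogonal to p).
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
assert abs((G @ u) @ q) < 1e-12                 # result is tangent at q
assert abs((G @ u) @ (G @ v) - u @ v) < 1e-12   # inner products preserved

# Transporting the model instead of the data: a covariance supported
# on the tangent space at p maps to G @ Sigma @ G.T at q.
Sigma = np.diag([2.0, 1.0, 0.0])
Sigma_q = G @ Sigma @ G.T
assert np.allclose(Sigma_q @ q, 0.0)            # supported on tangent at q
```

Because the transport is linear and inner-product-preserving, fitting the Gaussian at p and then transporting it gives the same result as transporting every sample and refitting, which is the 'commutes with learning' property in miniature.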