.. _fine_tuning:

Fine-Tuning for Downstream Tasks
================================

All Word2Vec and transformer-based embeddings, as well as any embedding followed by an ``svd`` transformation, are fine-tunable for downstream tasks. In other words, if you pass the resulting fine-tunable embedding to a PyTorch training method, the features will automatically be trained for your application.

Fine-tuning is disabled by default. To activate it, set the ``is_finetuneable`` parameter to ``True`` and specify a torch ``dtype`` as the type of the output.

.. code-block:: python

    import torch

    from textwiser import TextWiser, Embedding, Transformation, WordOptions, PoolOptions

    # Word2Vec embeddings with max pooling, fine-tunable during downstream training
    emb = TextWiser(Embedding.Word(word_option=WordOptions.word2vec),
                    Transformation.Pool(pool_option=PoolOptions.max),
                    is_finetuneable=True, dtype=torch.float32)

Note also that setting the ``sparse`` parameter of a ``WordOptions.word2vec`` model to ``True`` can yield a significant speedup during training. Currently, ``optim.SGD`` (CUDA and CPU), ``optim.SparseAdam`` (CUDA and CPU), and ``optim.Adagrad`` (CPU) support sparse embeddings.

:repo:`Fine Tuning Example ` shows the ability to fine-tune embeddings, and how they can improve the prediction performance.
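
To make the workflow concrete, below is a minimal sketch of fine-tuning such an embedding inside a plain PyTorch training loop. It assumes the fitted TextWiser object behaves as a standard ``torch.nn.Module`` whose forward pass accepts raw documents; the toy documents, labels, linear classifier, and loop settings are illustrative assumptions, not part of the TextWiser API. The end-to-end workflow is demonstrated in the Fine Tuning Example referenced above.

.. code-block:: python

    import torch
    import torch.nn as nn

    from textwiser import TextWiser, Embedding, Transformation, WordOptions, PoolOptions

    # Toy data; replace with your own documents and labels.
    docs = ["a great movie", "a terrible movie", "an okay movie"]
    labels = torch.tensor([1, 0, 1])

    # Fine-tunable featurizer: Word2Vec embeddings followed by max pooling.
    emb = TextWiser(Embedding.Word(word_option=WordOptions.word2vec),
                    Transformation.Pool(pool_option=PoolOptions.max),
                    is_finetuneable=True, dtype=torch.float32)
    emb.fit(docs)

    # Hypothetical downstream classifier on top of the pooled document vectors.
    clf = nn.Linear(emb(docs).shape[1], 2)

    optimizer = torch.optim.SGD(list(emb.parameters()) + list(clf.parameters()), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(10):
        optimizer.zero_grad()
        features = emb(docs)                  # forward pass through the embedding
        loss = loss_fn(clf(features), labels)
        loss.backward()                       # gradients also flow into the embedding
        optimizer.step()                      # embedding weights are fine-tuned here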
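
The sparse option can be sketched in a similar way. The snippet below assumes that ``sparse`` is passed through ``Embedding.Word`` to the underlying word2vec model and that all parameters of the embedding produce sparse gradients; both are assumptions based on the parameter name above, not a confirmed API contract.

.. code-block:: python

    import torch

    from textwiser import TextWiser, Embedding, Transformation, WordOptions, PoolOptions

    docs = ["a great movie", "a terrible movie", "an okay movie"]  # toy corpus

    # sparse=True requests sparse gradient updates for the word embedding table.
    emb_sparse = TextWiser(Embedding.Word(word_option=WordOptions.word2vec, sparse=True),
                           Transformation.Pool(pool_option=PoolOptions.max),
                           is_finetuneable=True, dtype=torch.float32)
    emb_sparse.fit(docs)

    # SparseAdam handles the sparse embedding gradients; any dense parameters
    # (e.g., a downstream classifier) would need their own dense optimizer.
    optimizer = torch.optim.SparseAdam(list(emb_sparse.parameters()), lr=1e-3)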