
Understanding Text Style Transfer

Have you ever thought about writing as if you were another person?


In middle school, we all struggle to write formal essays while teachers emphasize every language feature that is required. At work, we need to write formal reports to demonstrate our professionalism. And in our free time, we love reading and writing tweets, blogs, and much more. On social media, we also enjoy modulating our tone. After all, it is a lot of fun to express things in an unusual way!


Our text style converter (try it now!) is based on the newest AI technology and transfers your words into another style.

Understanding Text Style Transfer


Text style transfer aims to rewrite sentences from a source style into a target style while preserving their semantic content. For example, a formal version of “that stuff is so damn cool” could be “That thing is very impressive.”



This task is challenging because it involves a broad range of linguistic phenomena, such as syntactic simplification, word substitution, and sentiment manipulation.


Seq2Seq Translation

Given a large human-annotated dataset, the most effective method for style transfer is to frame it as a Sequence-to-Sequence (Seq2Seq) translation problem.


Assume we have two style corpora X and Y. The common practice is to take a pair of sentences (x, y), where x comes from corpus X and y comes from corpus Y. For each sentence pair, the Seq2Seq model is trained to generate y’ from the observed x and to make y’ as close to y as possible.
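One common way to make "as close as possible" precise is the maximum-likelihood (token-level cross-entropy) objective. The post does not state which loss is used, so the formula below is an assumption for illustration; the symbols θ (model parameters) and y_t (the t-th token of y) are notation introduced here.

```latex
\mathcal{L}(\theta) = -\sum_{(x, y)} \sum_{t=1}^{|y|} \log p_\theta\left(y_t \mid y_{<t},\, x\right)
```

Minimizing this loss pushes the model to assign high probability to each token of the reference sentence y given the source sentence x and the tokens generated so far.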


Problem of Lack of Annotated Data

The problem with the Seq2Seq approach is that it needs a very large number of sentence pairs. In practice, at least 100K sentences are needed for a quality transfer. The more pairs, the more accurate the model, because neural models do not extrapolate well to unseen cases. The Seq2Seq model therefore relies on human annotators to pair up sentences for every two styles. This hard requirement is rarely practical in real-life applications.


However, there are millions, even billions, of unpaired texts available out there. As a result, models that do not depend on parallel datasets have become a trend (Shen et al., 2017; Fu et al., 2017; Li et al., 2018; Prabhumoye et al., 2018; Santos et al., 2018).


Solution: IMT

In our newest paper, we propose a state-of-the-art model for unsupervised text style transfer. To take advantage of the large amount of non-parallel style corpora, our method breaks away from the common practice.


Related work on unsupervised text style transfer:

– Trend 1: GANs to disentangle style from content (Shen et al., 2017; Fu et al., 2017; Prabhumoye et al., 2018; Santos et al., 2018)

– Trend 2: Heuristic substitution (Li et al., 2018)

– Trend 3: Improving GAN by Language Modelling or Back Translation (Yang et al., 2018; Prabhumoye et al., 2018)


In contrast, we propose a new perspective on this task: constructing pseudo-parallel corpora and iteratively refining them to reach a quality transfer.


Iterative Matching and Translation (IMT)

Our method works as follows:


– Step 1: We first create an initial pseudo-parallel corpus by matching every source sentence x with the target sentence y that is most similar to it according to their sentence embeddings.

– Step 2: We then train a Seq2Seq model on this pseudo-parallel corpus (X, Y’) and collect the resulting translations (X, Y”). We refine the pseudo-parallel corpus by picking, for each x, the ŷ from (y’, y”) with the smaller Word Mover's Distance to x.

– Step 3: We reconstruct the pseudo-parallel corpus by matching each updated ŷ with its most similar target in Y. From this point on, we iterate between matching and translation, and run the model until the update rate becomes negligible.
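The three steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the implementation from the paper: `match`, `jaccard_distance`, `identity_model`, and `imt` are hypothetical names; bag-of-words vectors stand in for real sentence embeddings; a token-overlap (Jaccard) distance stands in for Word Mover's Distance; and the Seq2Seq model is replaced by a stub passed in as a callable.

```python
import numpy as np


def match(sources, targets):
    """For each source, return the most similar target by cosine similarity.

    Assumption: bag-of-words count vectors replace real sentence embeddings.
    """
    vocab = {}
    for sentence in sources + targets:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))

    def vectorize(sentences):
        matrix = np.zeros((len(sentences), len(vocab)))
        for i, sentence in enumerate(sentences):
            for token in sentence.lower().split():
                matrix[i, vocab[token]] += 1.0
        # Normalize rows so that a dot product equals cosine similarity.
        return matrix / (np.linalg.norm(matrix, axis=1, keepdims=True) + 1e-9)

    similarity = vectorize(sources) @ vectorize(targets).T
    return [targets[j] for j in similarity.argmax(axis=1)]


def jaccard_distance(a, b):
    """Placeholder for Word Mover's Distance: 1 - token-set overlap."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return 1.0 - len(set_a & set_b) / len(union) if union else 0.0


def identity_model(pairs, sources):
    """Stub for 'train a Seq2Seq model and translate X': echoes the paired y."""
    return [y for _, y in pairs]


def imt(X, Y, train_and_translate, distance, max_iters=5, tol=0.01):
    # Step 1: initial pseudo-parallel corpus from embedding similarity.
    y_hat = match(X, Y)
    for _ in range(max_iters):
        # Step 2: train on (X, y_hat), translate X, keep whichever candidate
        # is closer to x under the chosen distance.
        y_new = train_and_translate(list(zip(X, y_hat)), X)
        refined = [
            y1 if distance(x, y1) <= distance(x, y2) else y2
            for x, y1, y2 in zip(X, y_hat, y_new)
        ]
        # Step 3: re-match each refined sentence with its nearest target in Y.
        rematched = match(refined, Y)
        update_rate = sum(a != b for a, b in zip(y_hat, rematched)) / max(len(X), 1)
        y_hat = rematched
        if update_rate <= tol:  # stop once almost nothing changes
            break
    return y_hat


informal = ["that stuff is so damn cool", "gotta go now"]
formal = ["that thing is very impressive", "i have to go now", "the weather is pleasant"]
pseudo_parallel = imt(informal, formal, identity_model, jaccard_distance)
print(list(zip(informal, pseudo_parallel)))
```

With the stub model the loop converges after one pass, because the "translations" never differ from the matched targets. In a real setting, `train_and_translate` would train and run an actual Seq2Seq model, and `distance` would be a proper Word Mover's Distance over word embeddings.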


The refining effect of our method is obvious (see example below).

State-of-the-Art Results

Our method achieves state-of-the-art performance on two typical style transfer tasks.


The first task is sentiment modification, which aims to change the polarity of a sentence's sentiment, such as changing “The shop never answers our calls” to “The shop is very responsive to our calls.” Our model outperforms all previous methods by a large margin of 6.5%.


The second task manipulates the formality of text. For example, it changes the style of writing from informal to formal, or vice versa. This functionality is useful for both daily writing and professional needs. On this task, our model demonstrates an even larger improvement of 22.0% over previous state-of-the-art models.



Should you be interested in our model, please feel free to



