Humans have a remarkable ability to adapt their skills and reuse previously acquired knowledge in new situations with different goals or rewards. Such adaptability is an important hallmark of intelligence, yet current reinforcement learning agents often perform poorly in comparable situations. In this work, we propose a framework for learning a generalizable policy that can efficiently adapt to an unseen task, in a setting where tasks differ only in their reward functions. Our approach is based on two key components: (a) successor features, a representation scheme that makes it possible to immediately compute the value of a policy on any task, and (b) robust policy gradient, a generalization of the standard policy gradient theorem for finding a generalizable policy that performs well across a set of tasks. Combining these two components yields an approach that integrates naturally into the RL framework and can be applied to any actor-critic method with little change to the original algorithm implementation. We place our approach on a firm theoretical footing and present experiments showing that it successfully promotes transfer in A2C and PPO on a sequence of tasks in the Linear Quadratic Regulator environment.
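As a brief illustration of why successor features allow immediate evaluation across tasks, consider the standard successor-feature decomposition (notation here is an assumption for exposition and need not match the paper's): rewards are assumed to factor linearly into task-independent features $\boldsymbol{\phi}$ and a task-specific weight vector $\mathbf{w}$, so that any policy's value on a new task follows from a single inner product.
\[
  r(s, a, s') = \boldsymbol{\phi}(s, a, s')^{\top} \mathbf{w},
  \qquad
  \boldsymbol{\psi}^{\pi}(s, a) = \mathbb{E}^{\pi}\!\left[\sum_{i=0}^{\infty} \gamma^{i}\, \boldsymbol{\phi}_{t+i+1} \,\middle|\, S_t = s,\; A_t = a\right],
  \qquad
  Q^{\pi}_{\mathbf{w}}(s, a) = \boldsymbol{\psi}^{\pi}(s, a)^{\top} \mathbf{w}.
\]
Under this assumption, the successor features $\boldsymbol{\psi}^{\pi}$ are learned once per policy, and evaluating that policy on a task specified by a new $\mathbf{w}$ requires no further learning.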