Machine learning provides effective computational tools for exploring chemical space via deep generative models. Here, we propose a new reinforcement learning scheme to fine-tune graph-based deep generative models for de novo molecular design tasks. We show how our computational framework can successfully guide a pretrained generative model toward the generation of molecules with a specific property profile, even when such molecules are not present in the training set and are unlikely to be generated by the pretrained model. We explored the following tasks: generating molecules of decreasing/increasing size, increasing drug-likeness, and increasing bioactivity. Using the proposed approach, we obtain a model that generates diverse compounds with predicted DRD2 activity for 95% of sampled molecules, outperforming previously reported methods on this metric.