Learning to read involves establishing associations between meaningless visual inputs (V) and their phonological representations (P). Here, we recorded the brain signals (ERPs and fMRI) associated with phonological recoding (i.e., V-P conversion processes) in an artificial learning situation in which participants had to learn the associations between 24 unknown visual symbols (Japanese Katakana characters) and 24 arbitrary monosyllabic names. During the learning phase on Day 1, the strength of V-P associations was manipulated by varying the proportion of correct and erroneous associations displayed during a two-alternative forced-choice task. Recording event-related potentials (ERPs) during the learning phase allowed us to track changes in the processing of these visual symbols as a function of the strength of V-P associations. We found that, at the end of the learning phase, ERPs were linearly affected by the strength of V-P associations in a time window starting around 200 ms post-stimulus onset over right occipital sites and ending around 345 ms over left occipital sites. On Day 2, participants performed a matching task during an fMRI session, and the strength of these V-P associations was again used as a probe for identifying brain regions related to phonological recoding. Crucially, we found that activity in the left fusiform gyrus was gradually modulated by the strength of V-P associations, suggesting that this region is part of the brain network supporting phonological recoding processes.