Place recognition is a complex process involving idiothetic and allothetic information. In mammals, evidence suggests that visual information stemming from the temporal and parietal cortical areas ('what' and 'where' information) is merged at the level of the entorhinal cortex (EC) to build a compact code of a place. Local views extracted from specific feature points can provide information that is important for view cells (in primates) and place cells (in rodents), even when the environment changes dramatically. Robotics experiments using conjunctive cells that merge 'what' and 'where' information related to different local views show that these cells play an important role in obtaining place cells with strong generalization capabilities. This convergence of information may also explain the formation of grid cells in the medial EC if we suppose that: (1) path integration information is computed outside the EC, (2) this information is compressed at the level of the EC through a projection, following a modulo principle, of cortical activities associated with discretized vector fields representing angles and/or path integration, and (3) conjunctive cells merge the projections of different modalities to build grid cell activities. Applying the same modulo projection to visual information compresses that information and could explain more recent results on grid cells related to visual exploration. In conclusion, the EC could be dedicated to building a robust yet compact code of cortical activity, whereas the hippocampus proper recognizes these complex codes and learns to predict the transition from one state to another.
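As a rough illustration of hypotheses (2) and (3), the following Python sketch folds a path-integration displacement onto a small ring of cells with a modulo operation and then combines several such rings in a conjunctive code. It is a toy model under stated assumptions, not the implementation used in the robotics experiments: the three preferred directions spaced 60° apart, the bin size, the modulus, and the function names `modulo_projection` and `grid_code` are all hypothetical choices made for illustration.

```python
import math


def modulo_projection(value, bin_size, modulus):
    """Discretize a scalar into bins, then fold the bin index onto `modulus` cells."""
    return math.floor(value / bin_size) % modulus


def grid_code(x, y, directions_deg=(0.0, 60.0, 120.0), bin_size=1.0, modulus=5):
    """Return the conjunctive code of position (x, y).

    Each entry is the index of the active cell on one ring, obtained by
    projecting the displacement onto one preferred direction (assumed here
    to come from path integration computed upstream) and applying the
    modulo projection. A conjunctive cell would respond to one specific
    tuple of indices.
    """
    code = []
    for angle in directions_deg:
        theta = math.radians(angle)
        displacement = x * math.cos(theta) + y * math.sin(theta)
        code.append(modulo_projection(displacement, bin_size, modulus))
    return tuple(code)


def conjunctive_cell_active(x, y, preferred_code=(0, 0, 0), **kwargs):
    """True when the conjunctive cell tuned to `preferred_code` fires at (x, y)."""
    return grid_code(x, y, **kwargs) == preferred_code
```

With these assumed parameters, the code repeats whenever the position is shifted by a vector whose projection onto every preferred direction is a multiple of `bin_size * modulus`, for example `(5, -5 / sqrt(3))` or `(0, 10 / sqrt(3))`. These two vectors have equal length and are 120° apart, so a conjunctive cell tuned to one code fires on a hexagonal lattice of positions. The number of cells needed stays fixed at three rings of `modulus` cells each, however far the animal or robot travels, which is the sense in which the projection compresses the path-integration signal.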