Multi-core execution parallelizes the program across different instruction streams. Superscalar execution parallelizes the program across independent instructions within the same instruction stream. Conceptually, therefore, the two are orthogonal: a program can exploit both at once.
Another review question: on most systems today, "who" is responsible for identifying the kind of parallelism in a program that is exploited via multi-core processing? What about superscalar processing?
Usually programmers or compilers are responsible for identifying the parallelism exploited via multi-core processing, while the parallelism exploited via superscalar processing is found by the hardware (the CPU itself).
I think this is because the hardware can only use local information (instructions close together, within its instruction window) to discover parallelism, whereas programmers and compilers have a much wider view, such as the properties of a task or the structure of the whole program.
The hardware is responsible for finding independent instructions to execute in parallel, in the case of superscalar processing. However, there are architectures where the compiler is used instead to find and schedule independent instructions, e.g., VLIW (Very Long Instruction Word) architectures.
Compilers can help with superscalar processing as well: they can schedule instructions so that more ILP is discoverable by the core. For example, it makes sense to place long-latency instructions several instructions ahead of their dependents.