"segmentation" is a partition of an image into several "coherent" parts, but without any attempt at understanding what these parts represent. One of the most famous works (but definitely not the first) is Shi and Malik "Normalized Cuts and Image Segmentation" PAMI 2000. These works attempt to define "coherence" in terms of low-level cues such as color, texture and smoothness of boundary. You can trace back these works to the Gestalt theory.
On the other hand "semantic segmentation" attempts to partition the image into semantically meaningful parts, and to classify each part into one of the pre-determined classes. You can also achieve the same goal by classifying each pixel (rather than the entire image/segment). In that case you are doing pixel-wise classification, which leads to the same end result but in a slightly different path...
So, I suppose you can say that "semantic segmentation", "scene labeling" and "pixelwise classification" are basically trying to achieve the same goal: semantically understanding the role of each pixel in the image. You can take many paths to reach that goal, and these paths lead to slight nuances in the terminology.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…