The traditional meaning of "data center" is a large building (or at least a large portion of a floor of a building) dedicated to computing infrastructure. Think Google, Amazon, Microsoft, Facebook: a football-field-sized room or collection of rooms with many rows of racks of computers, network routers, and storage systems. Traditionally, an organization had one "main" data center and one or more "backup" data centers.
Modern data centers are implicitly distributed data centers. None is "main"; they are all peers. The goal is not just capacity but reliability and high availability.
In either case, the goal is for an organization's computing services to be available despite natural and man-made disasters.
So, system administrators need to be sure that a given application is available in multiple data centers.
Amazon in particular uses the term "availability zone" for one or more real data centers within a region.
Cassandra and DSE are designed for reliability and high availability. That means they need to support "multiple data centers", in that modern, distributed sense. These are physical or "real" or "actual" data centers.
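As a rough illustration of what "multiple data centers" means in practice, Cassandra lets a keyspace declare how many replicas to keep in each data center through the NetworkTopologyStrategy replication class. The sketch below only builds the CQL text; the keyspace name `demo_ks`, the data center names `DC_East` and `DC_West`, and the helper function itself are hypothetical, chosen for the example.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement that uses NetworkTopologyStrategy.

    replicas_per_dc maps a data center name to its replication factor.
    The names passed in are assumptions; use the names your cluster reports.
    """
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    # Sort the data center names so the generated text is deterministic.
    for dc_name, replication_factor in sorted(replicas_per_dc.items()):
        options.append("'%s': %d" % (dc_name, int(replication_factor)))
    body = ", ".join(options)
    return "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {%s};" % (keyspace, body)


# Hypothetical example: three replicas in each of two data centers.
statement = create_keyspace_cql("demo_ks", {"DC_East": 3, "DC_West": 3})
print(statement)
```

With replicas in more than one data center, the loss of a single data center does not by itself make the data unavailable.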
Separate from this concept of hardware reliability and distributed data centers, DSE has introduced the concept of a "workload", and this is where "virtual data center" comes in.
The basic idea is that different workloads, such as Hadoop analytics, Cassandra real-time database access, and Solr rich search, have rather different computing requirements and resource utilization patterns, so mixing these distinct workloads on the same nodes is not a great idea.
For better or worse, DSE uses the term "data center" or DC to allow the system administrator to segregate workloads so that workloads do not interfere with each other.
But, now we have two distinct uses of the term "data center" - real, physical data centers (such as Amazon availability zones) and logical or "virtual" segregation of nodes in a cluster based on workload.
Generally, in discussions of DSE, "data center" implicitly means the logical or virtual data center (workload partition). In the interest of clarity, though, we try to be specific about whether we mean data center in the high-availability sense or in the workload partition/segregation sense.
The important thing is that DSE supports both.
At least for now, segregation of nodes by workload is NOT performed automatically by DSE. It is an explicit, manual administration responsibility.
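Since the segregation is manual, it can help to keep an explicit inventory of which node belongs to which physical site and which workload. The following is only a planning sketch, not anything DSE provides: the inventory, the addresses, the site and workload labels, and the "site-workload" naming scheme for logical data centers are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical inventory: (node address, physical site, workload).
NODES = [
    ("10.0.1.1", "east", "cassandra"),
    ("10.0.1.2", "east", "cassandra"),
    ("10.0.1.3", "east", "analytics"),
    ("10.0.2.1", "west", "cassandra"),
    ("10.0.2.2", "west", "search"),
]


def group_by_logical_dc(nodes):
    """Group node addresses into logical data centers, one per site and workload."""
    groups = defaultdict(list)
    for address, site, workload in nodes:
        # Assumed naming scheme: "<site>-<workload>".
        groups["%s-%s" % (site, workload)].append(address)
    return dict(groups)


def nodes_with_mixed_workloads(nodes):
    """Return addresses listed under more than one workload, which the plan should avoid."""
    workloads = defaultdict(set)
    for address, _site, workload in nodes:
        workloads[address].add(workload)
    return sorted(a for a, w in workloads.items() if len(w) > 1)


logical_dcs = group_by_logical_dc(NODES)
for name in sorted(logical_dcs):
    print(name, logical_dcs[name])
print("mixed:", nodes_with_mixed_workloads(NODES))
```

In this made-up plan, two physical sites yield four logical data centers, which shows how the two senses of "data center" combine.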
Yes, we should clarify the doc. Thanks for the hint.