Yes, that is totally possible.
Your outline for it won't work as efficiently, however. Here's why:
So "Brisk" as a whole is a package. It contains any number combination of vanilla Cassandra nodes and Brisk nodes, but in order to simplify things, we will call them vanilla nodes and tt (task tracker) nodes.
Vanilla nodes run just Cassandra. TT nodes run Cassandra and also run task trackers. To help explain exactly what the snitch does, I'll go through why we use DCs (data centers) in Brisk.
If we connect 12 nodes via the SimpleSnitch, we form one ring, split into 12 spots on the token range. Now suppose we change the snitch to the EC2Snitch. The same 12 nodes keep the same 12 spots on the ring, but the EC2Snitch will automatically report, say, 4 of those nodes as belonging to another "DC" as far as Cassandra is concerned, because they actually are in another DC (a different EC2 region).
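For illustration, here is a minimal sketch of how those 12 evenly spaced spots could be computed, assuming the RandomPartitioner (token space 0 to 2**127):

    # Evenly spaced initial_token values for a 12-node ring, assuming
    # the RandomPartitioner's 0 .. 2**127 token space.
    RING_SIZE = 2 ** 127
    NUM_NODES = 12

    tokens = [i * RING_SIZE // NUM_NODES for i in range(NUM_NODES)]
    for i, t in enumerate(tokens):
        print('node %2d: initial_token = %d' % (i, t))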
With the cluster set up like this, we can easily configure how data is replicated and kept safe. For example: every time data gets written to the east coast, send one copy to the west coast. If the east coast has a power outage, all the data is still available on the west coast.
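In Cassandra terms, that policy is just a keyspace using the NetworkTopologyStrategy. Here's a minimal sketch, assuming the pycassa client and EC2Snitch-style DC names ("us-east", "us-west"); adjust both to your setup:

    from pycassa.system_manager import SystemManager, NETWORK_TOPOLOGY_STRATEGY

    sys_mgr = SystemManager('localhost:9160')  # any live node works

    # Keep 2 replicas on the east coast and 1 copy on the west coast, so
    # a full east-coast outage still leaves every row readable out west.
    sys_mgr.create_keyspace(
        'MyKeyspace',
        replication_strategy=NETWORK_TOPOLOGY_STRATEGY,
        strategy_options={'us-east': '2', 'us-west': '1'},
    )
    sys_mgr.close()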
With the BriskSnitch, given the same 12 nodes, if all of them run as vanilla nodes, they all appear in the same cluster, in a single DC. If instead you start 8 vanilla nodes and 4 tt nodes, you get 8 nodes in one "DC" and 4 in another "DC", because Cassandra sees them as such, not because they actually ARE in separate physical DCs. But if they were in different physical DCs, the result would be exactly the same.
Because we tricked Cassandra, however, we can easily set up the cluster to always send data to the Brisk DC as well. This way both parts of the same cluster have the same information. The clients read and write to the vanilla nodes, all the heavy analytics fall on the tt nodes, and we never have to make sure the tt nodes have the most up-to-date information, because that replication is built in at the multi-DC level.
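Concretely, this could look like the sketch below, again using pycassa and assuming the Brisk snitch reports the two DCs as "Cassandra" (vanilla) and "Brisk" (tt); verify the actual names with nodetool ring:

    # Keyspace: same create_keyspace pattern as above, but pointed at the
    # Brisk DC names (hypothetical; check `nodetool ring` on your cluster):
    #     strategy_options={'Cassandra': '2', 'Brisk': '1'}
    #
    # Clients then read and write with LOCAL_QUORUM so their traffic stays
    # in the vanilla DC, while replication to the tt DC happens on its own.
    import pycassa
    from pycassa import ConsistencyLevel

    pool = pycassa.ConnectionPool('MyKeyspace',
                                  server_list=['vanilla-node1:9160'])
    users = pycassa.ColumnFamily(
        pool, 'users',
        write_consistency_level=ConsistencyLevel.LOCAL_QUORUM,
        read_consistency_level=ConsistencyLevel.LOCAL_QUORUM,
    )
    users.insert('alice', {'email': 'alice@example.com'})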
So when you say you want the 4 Brisk nodes to have a separate ring, you can do that, but then picture them as a separate cluster. You could configure your client to always write to both the vanilla cluster and the tt cluster, so you can run analytics on real-time data. Or you can follow the setup above and take the replication load off the client, since replicating to a second DC is already an innate function of Cassandra's DataCenter setup.
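A minimal sketch of that first option (dual writes from the client, assuming pycassa and hypothetical host names), mostly to show the extra work it puts on the client:

    import pycassa

    # Two entirely separate clusters, each with its own connection pool.
    vanilla = pycassa.ColumnFamily(
        pycassa.ConnectionPool('MyKeyspace', server_list=['vanilla-node1:9160']),
        'users')
    tt = pycassa.ColumnFamily(
        pycassa.ConnectionPool('MyKeyspace', server_list=['tt-node1:9160']),
        'users')

    # The client now owns the job of keeping both clusters in sync, which
    # is exactly the work the DC-based setup above gets for free.
    row = {'email': 'alice@example.com'}
    vanilla.insert('alice', row)
    tt.insert('alice', row)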
So as far as your concern about having production-ready HA/DR: this is exactly what you would have if the 4 nodes were in a physically different datacenter. The BriskSnitch simply reuses Cassandra's powerful innate HA/DR, scaled down to handle replication management within one location, and it's still the same powerful tool as soon as physical datacenters are used.
See http://www.datastax.com/docs/0.8/operations/datacenter for more information.
Quick note on tokens:
You want the tokens to be equally split within each DC, since each DC runs its own replication algorithm. If, on every conflict, you offset one ring's token by one until there are no conflicts, this will work 100% fine. But if you ever plan on using OpsCenter, it's just more visually pleasing to see the whole ring nicely split.
For an actual 2-DC ring-splitting algorithm, see: https://github.com/riptano/BriskClusterAMI/blob/master/tokentool.py
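For reference, a minimal sketch of the same idea (even spacing per DC, second DC offset by one to avoid token collisions), under the same RandomPartitioner assumption as before:

    RING_SIZE = 2 ** 127

    def dc_tokens(num_nodes, offset=0):
        # Evenly space the DC's tokens; a non-zero offset shifts the whole
        # ring so it never collides with the other DC's tokens.
        return [i * RING_SIZE // num_nodes + offset for i in range(num_nodes)]

    vanilla_tokens = dc_tokens(8)       # 8 vanilla nodes
    tt_tokens = dc_tokens(4, offset=1)  # 4 tt nodes, offset by one

    for label, tokens in (('vanilla', vanilla_tokens), ('tt', tt_tokens)):
        for i, t in enumerate(tokens):
            print('%s node %d: initial_token = %d' % (label, i, t))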