DataStax PHP Driver: 1.1 GA Released!

By Michael Penick - February 11, 2016

We are pleased to announce the 1.1 GA release of the PHP driver for Apache Cassandra. This release includes all the features necessary to take full advantage of Apache Cassandra 2.1, including support for tuples, user defined types (UDTs), nested collections, client-side timestamps, and binding named arguments when using simple statements. In addition to supporting Cassandra 2.1 features, the release also brings support for PHP 7, retry policies, raw paging token access, and the ability to disable schema metadata. Example code for all the new features in this release can be found in the features directory of the driver's source code.

What's new

Support for PHP 7

The PHP driver can now be used with PHP 7! The driver continues to work with officially supported versions of PHP 5.

User Defined Types

User defined types (UDTs for short), introduced in Apache Cassandra 2.1, allow for creating arbitrarily nested, composite data types with multiple fields in a single column. This can be useful for simplifying a schema by grouping related fields into a single UDT instead of using multiple columns. More information about using UDTs can be found in this post.

Inserting a user defined type

$cluster = Cassandra::cluster()->build();
$session = $cluster->connect("music");

$statement = new Cassandra\SimpleStatement(
    "CREATE TYPE IF NOT EXISTS song_metadata (duration int, bit_rate set<text>, encoding text)");
$session->execute($statement);

$statement = new Cassandra\SimpleStatement(
    "CREATE TABLE IF NOT EXISTS songs (id uuid PRIMARY KEY, name text, metadata frozen<song_metadata>)");
$session->execute($statement);

# The UDT can be retrieved from the schema metadata
$songMetadataType = $session->schema()->keyspace("music")->userType("song_metadata");

# Construct a UDT value from the UDT type
$songMetadata = $songMetadataType->create(
    "duration", 180,
    "bit_rate", Cassandra\Type::set(Cassandra\Type::text())->create("128kbps", "256kbps"),
    "encoding", "mp3");

# Bind and execute using the constructed UDT value
$statement = new Cassandra\SimpleStatement("INSERT INTO songs(id, name, metadata) VALUES (?, ?, ?)");
$options = new Cassandra\ExecutionOptions(
    array(
        "arguments" => array(
            new Cassandra\Uuid(),
            "Some Song",
            $songMetadata
        )
    )
);

$session->execute($statement, $options);

User defined types can also be constructed programmatically. This can be useful, for instance, when schema metadata is disabled or unavailable.

# Field names must match the ones used when creating the value below
$songMetadataType = Cassandra\Type::userType(
    "duration", Cassandra\Type::int(),
    "bit_rate", Cassandra\Type::set(Cassandra\Type::text()),
    "encoding", Cassandra\Type::text()
);

$songMetadata = $songMetadataType->create(
    "duration", 180,
    "bit_rate", Cassandra\Type::set(Cassandra\Type::text())->create("128kbps", "256kbps"),
    "encoding", "mp3");

# ...

Tuples

Tuples, also introduced in Apache Cassandra 2.1, are useful for creating positional, fixed-length sequences of values with mixed types. They're similar to UDTs in that they are an arbitrary composite type. However, tuple fields are unnamed, so they can only be referenced by position. This also means that it is not possible to add new fields to a tuple.

$cluster = Cassandra::cluster()->build();
$session = $cluster->connect("music");

$statement = new Cassandra\SimpleStatement(
    "CREATE TABLE IF NOT EXISTS songs_using_tuple (id uuid PRIMARY KEY, name text, metadata tuple<int, frozen<set<text>>, text>)");
$session->execute($statement);

# Create a new tuple type
$songMetadataType = Cassandra\Type::tuple(
    Cassandra\Type::int(),
    Cassandra\Type::set(Cassandra\Type::text()),
    Cassandra\Type::text()
);

# Construct a tuple value using the tuple type
$songMetadata = $songMetadataType->create(
    180,
    Cassandra\Type::set(Cassandra\Type::text())->create("128kbps", "256kbps"),
    "mp3"
);

# Bind and execute using the constructed tuple value
$statement = new Cassandra\SimpleStatement("INSERT INTO songs_using_tuple (id, name, metadata) VALUES (?, ?, ?)");
$options = new Cassandra\ExecutionOptions(
    array(
        "arguments" => array(
            new Cassandra\Uuid(),
            "Some Song",
            $songMetadata
        )
    )
);

$session->execute($statement, $options);

Nested Collections

List, map, and set types can now be arbitrarily nested. Collections can even be used as map keys and set elements.

use Cassandra\Type;
use Cassandra\Decimal;

$setType = Type::set(Type::int());
$map = Type::map($setType, Type::text())->create(
    $setType->create(1, 2, 3), "abc",
    $setType->create(4, 5, 6), "xyz"
);

echo "The value of {4, 5, 6} is : " . $map->get($setType->create(4, 5, 6)) . "\n"; # "xyz"

$listType = Type::collection(Type::decimal());
$set = Type::set($listType)->create(
    $listType->create(new Decimal("0.0"), new Decimal("1.0")),
    $listType->create(new Decimal("2.0"), new Decimal("3.0"), new Decimal("4.0"))
);

if ($set->has($listType->create(new Decimal("2.0"), new Decimal("3.0"), new Decimal("4.0")))) {
    echo "Yup! It's in there.\n";
}

Client-side Timestamps

Apache Cassandra uses timestamps to order write operations: the value with the most recent timestamp is considered the most up-to-date version of that information. Previous versions of the PHP driver only allowed timestamps to be assigned server-side by Cassandra, which is not ideal for all applications. This release allows timestamps to be generated client-side, either by setting a global timestamp generator or by assigning a specific timestamp to a statement or batch. By default, the driver uses a server-side timestamp generator and behaves the same as previous versions. The driver also includes a monotonic timestamp generator, which assigns microsecond-granularity timestamps client-side. It is useful for applications that make rapid mutations from a single driver instance, because it can prevent writes from that instance from being reordered.

Using the monotonic timestamp generator

$cluster = Cassandra::cluster()
              ->withContactPoints('127.0.0.1')
              ->withTimestampGenerator(new Cassandra\TimestampGenerator\Monotonic())
              ->build();

# Insert and update requests will now be assigned a client-side timestamp using the
# monotonic timestamp generator...
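
To illustrate the idea behind a monotonic generator, here is a minimal sketch in plain PHP. It is not the driver's implementation: the class name MonotonicMicrosecondClock and its next() method are hypothetical and exist only for this example. The sketch assumes a 64-bit PHP build and shows one way to keep timestamps strictly increasing even when the system clock stalls or steps backwards.

```php
<?php
// Hypothetical illustration only; not part of the Cassandra PHP driver.
final class MonotonicMicrosecondClock
{
    // Last timestamp handed out, in microseconds since the Unix epoch.
    private $last = 0;

    // Returns a timestamp strictly greater than any previously returned one.
    // $nowMicros can be injected for testing; by default the wall clock is used.
    public function next($nowMicros = null)
    {
        if ($nowMicros === null) {
            $nowMicros = (int) floor(microtime(true) * 1000000);
        }
        // If the clock did not advance (or went backwards), bump by one microsecond.
        $this->last = max($nowMicros, $this->last + 1);
        return $this->last;
    }
}

$clock = new MonotonicMicrosecondClock();
$a = $clock->next(1000); // clock reads 1000
$b = $clock->next(1000); // clock did not advance
$c = $clock->next(900);  // clock stepped backwards
echo "$a $b $c\n";       // prints "1000 1001 1002"
```

Because each call returns a value greater than the previous one, two writes issued back to back from the same process never carry equal or out-of-order timestamps.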

Timestamps can also be assigned for each individual request using Cassandra\ExecutionOptions.

Assigning a client-side timestamp per request

$simple = new Cassandra\SimpleStatement(
    "INSERT INTO playlists (id, song_id, artist, title, album) " .
    "VALUES (62c36092-82a1-3a00-93d1-46196ee77204, ?, ?, ?, ?)"
);

# Positional arguments follow the column order: song_id, artist, title, album
$arguments = array(
    new Cassandra\Uuid('756716f7-2e54-4715-9f00-91dcbea6cf50'),
    'Joséphine Baker',
    'La Petite Tonkinoise',
    'Bye Bye Blackbird'
);
$options = new Cassandra\ExecutionOptions(array(
    'arguments' => $arguments,
    'timestamp' => 1234 # A timestamp can be assigned per request in execution options
));
$session->execute($simple, $options);

$statement = new Cassandra\SimpleStatement(
  "SELECT artist, title, album, WRITETIME(song_id) FROM simplex.playlists");
$result    = $session->execute($statement);

foreach ($result as $row) {
  echo $row['artist'] . ": " . $row['title'] . " / " . $row['album'] . " (". $row['writetime(song_id)'] . ")\n";
}

Support Named Arguments when using Cassandra\SimpleStatement

It is now possible to name arguments when using SimpleStatement. In previous releases only positional arguments were supported for simple statement queries, that is, arguments denoted with "?" needed to be bound to a query in the same order as they appeared in the query string.

Named parameters now work with simple statement insert queries

$simple = new Cassandra\SimpleStatement(
    "INSERT INTO playlists (id, song_id, artist, title, album) " .
    "VALUES (62c36092-82a1-3a00-93d1-46196ee77204, ?, ?, ?, ?)"
);

# Using named arguments now works with simple statements!
$arguments = array(
    'song_id' => new Cassandra\Uuid('756716f7-2e54-4715-9f00-91dcbea6cf50'),
    'title'   => 'La Petite Tonkinoise',
    'album'   => 'Bye Bye Blackbird',
    'artist'  => 'Joséphine Baker'
);

$options = new Cassandra\ExecutionOptions(array(
    'arguments' => $arguments,
));

$session->execute($simple, $options);

This version of the driver also allows parameters to be named using the ":<name>" syntax. Named arguments can still be used in conjunction with prepared queries, but are most useful for non-prepared queries, where metadata for the parameters' names is not available.

Using ":<name>" parameters with a simple statement

$statement = new Cassandra\SimpleStatement(
    "SELECT * FROM simplex.playlists " .
    "WHERE id = :id AND artist = :artist AND title = :title AND album = :album"
);

$options = new Cassandra\ExecutionOptions(
    array('arguments' =>
        array(
            'id'     => new Cassandra\Uuid('62c36092-82a1-3a00-93d1-46196ee77204'),
            'artist' => 'Joséphine Baker',
            'title'  => 'La Petite Tonkinoise',
            'album'  => 'Bye Bye Blackbird'
        )
    )
);

$result = $session->execute($statement, $options);

$row = $result->first();
echo $row['artist'] . ": " . $row['title'] . " / " . $row['album'] . "\n";

Retry Policies

The use of retry policies allows the PHP driver to automatically handle server-side failures when Cassandra is unable to fulfill the consistency requirements of a request. The default retry policy will only retry a request when it will preserve the original consistency level and when it is likely to succeed (there are enough replicas). The default retry policy can be overridden per session by using Cluster::withRetryPolicy() or it can be set per request using the execution option "retry_policy".

Changing the default policy to the downgrading consistency policy

$cluster     = Cassandra::cluster()
                 ->withContactPoints('127.0.0.1')
                 ->withRetryPolicy(new Cassandra\RetryPolicy\DowngradingConsistency())
                 ->build();

$session     = $cluster->connect();

# ...
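
The rule described above (retry only at the same consistency level, and only when success is likely) can be sketched as a small decision function for the read timeout case. This is an illustration under assumptions, not the driver's code: the function shouldRetryReadTimeout() and its parameters are hypothetical, and the "retry at most once" limit is an assumption of this sketch.

```php
<?php
// Hypothetical decision helper; not part of the Cassandra PHP driver's API.
// $required:      replica responses needed for the requested consistency level
// $received:      replica responses that actually arrived before the timeout
// $dataRetrieved: whether the replica asked for the data itself had responded
// $retryCount:    how many times this request has already been retried
function shouldRetryReadTimeout($required, $received, $dataRetrieved, $retryCount)
{
    if ($retryCount > 0) {
        return false; // assumption of this sketch: retry at most once
    }
    // Enough replicas answered, but the data did not arrive in time, so a
    // retry at the same consistency level is likely to succeed.
    return $received >= $required && !$dataRetrieved;
}

var_dump(shouldRetryReadTimeout(2, 2, false, 0)); // bool(true)
var_dump(shouldRetryReadTimeout(2, 1, false, 0)); // bool(false): too few replicas
var_dump(shouldRetryReadTimeout(2, 2, false, 1)); // bool(false): already retried
```

A downgrading policy differs in the second case: instead of giving up, it would retry at a lower consistency level that the responding replicas can satisfy.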

The driver also provides a fall-through policy that always returns an error and a logging policy which can be used in conjunction with other policies to log their retry decisions.

Chaining the downgrading policy to the logging policy

$retry_policy = new Cassandra\RetryPolicy\DowngradingConsistency();

$cluster     = Cassandra::cluster()
                 ->withContactPoints('127.0.0.1')
                 ->withRetryPolicy(new Cassandra\RetryPolicy\Logging($retry_policy))
                 ->build();

$session     = $cluster->connect();

# ...

Retry policies can also be assigned per-request using the "retry_policy" execution option.

Assigning a retry policy to a specific request

$statement   = new Cassandra\SimpleStatement("INSERT INTO playlists (id, song_id, artist, title, album)
                                              VALUES (62c36092-82a1-3a00-93d1-46196ee77204, ?, ?, ?, ?)");

$arguments   = array(new Cassandra\Uuid('756716f7-2e54-4715-9f00-91dcbea6cf50'),
                    'Joséphine Baker',
                    'La Petite Tonkinoise',
                    'Bye Bye Blackbird'
                    );

$retry_policy = new Cassandra\RetryPolicy\DowngradingConsistency();

# This specific retry policy is used only for this single request
$options     = new Cassandra\ExecutionOptions(array(
                    'consistency' => Cassandra::CONSISTENCY_QUORUM,
                    'arguments' => $arguments,
                    'retry_policy' => new Cassandra\RetryPolicy\Logging($retry_policy)
                    ));

$session->execute($statement, $options);

# ...

Raw Paging Token

Previously, the PHP driver handled paging transparently by managing the paging state internally. It is now possible to access the paging state token using Cassandra\Rows::pagingStateToken() and later resume paging by setting the "paging_state_token" execution option when executing a statement. This allows client applications to store the token for later use. The paging state should not be exposed to or come from untrusted environments.

Using the paging state token to page results

$cluster   = Cassandra::cluster()
               ->withContactPoints('127.0.0.1')
               ->build();
$session   = $cluster->connect("simplex");
$statement = new Cassandra\SimpleStatement("SELECT * FROM entries");
$options = array('page_size' => 2);
$result = $session->execute($statement, new Cassandra\ExecutionOptions($options));

foreach ($result as $row) {
  printf("key: '%s' value: %d\n", $row['key'], $row['value']);
}

while ($result->pagingStateToken()) {
    # The previous paging state token is used to get the next page of results
    $options = array(
        'page_size' => 2,
        'paging_state_token' => $result->pagingStateToken()
    );

    $result = $session->execute($statement, new Cassandra\ExecutionOptions($options));

    foreach ($result as $row) {
      printf("key: '%s' value: %d\n", $row['key'], $row['value']);
    }
}
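
One way to keep the token away from untrusted environments is to hold it server-side and hand the client only a random handle. The sketch below assumes PHP 7 (for random_bytes()). The class PagingTokenStore is hypothetical, and its in-memory array stands in for a real server-side store such as the PHP session.

```php
<?php
// Hypothetical helper; not part of the Cassandra PHP driver.
final class PagingTokenStore
{
    // Maps an opaque random handle to the raw paging state token.
    private $tokens = array();

    // Stores a token and returns a handle that is safe to give to a client.
    public function put($token)
    {
        $handle = bin2hex(random_bytes(16)); // 32 hex characters, PHP 7+
        $this->tokens[$handle] = $token;
        return $handle;
    }

    // Returns the token for a handle exactly once, or null if the handle is unknown.
    public function take($handle)
    {
        if (!is_string($handle) || !isset($this->tokens[$handle])) {
            return null;
        }
        $token = $this->tokens[$handle];
        unset($this->tokens[$handle]);
        return $token;
    }
}

$store  = new PagingTokenStore();
// In practice the value would come from $result->pagingStateToken().
$handle = $store->put("opaque-token-bytes");
// Later, the recovered token would be passed as 'paging_state_token'.
$token  = $store->take($handle);
var_dump($token === "opaque-token-bytes"); // bool(true)
```

With this approach the client never sees or supplies the raw paging state; an unknown or reused handle simply yields null, and the application can restart paging from the first page.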

Disable Schema Metadata

Schema metadata is kept up-to-date by the driver for use by client applications. It can be used directly or, as of the 1.1 release, to construct complex data types such as UDTs, tuples, and collections. It is also used by the token aware policy to determine the replication strategy of keyspaces. However, some applications might wish to eliminate this overhead, so it is now possible to prevent the driver from retrieving and maintaining the schema metadata. This can improve startup performance in applications with short-lived sessions or in applications where schema metadata isn't used.

$cluster   = Cassandra::cluster()
                   ->withContactPoints('127.0.0.1')
                   ->withSchemaMetadata(false) # Disable schema metadata
                   ->build();
$session   = $cluster->connect("simplex");
$schema    = $session->schema();
print count($schema->keyspaces()) . "\n"; # "0"

Internal improvements

This release also includes the following internal improvements:

  • The default consistency is now LOCAL_ONE instead of ONE
  • Fixed encoding/decoding for decimal and varint

Looking forward

This release brings with it full support for Apache Cassandra 2.1 along with many other great features including support for PHP 7! In the next release we will be focusing our efforts on supporting Apache Cassandra 2.2 and 3.0. Let us know what you think about the 1.1 GA release. Your feedback is important to us and it influences what features we prioritize. To provide feedback use the following:
















Comments

  1. kevin says:

    How do I release the memory used by Rows when I use nextPage to scan a table to fix some data errors? (PHP driver)

  2. Michal says:

    When will version 1.2 be released?
