
Uts experiments: first pass at Claude-generated test specs for REST API #421

Closed

paddybyers wants to merge 7 commits into main from uts-experiments

Conversation

@paddybyers (Member):

There are clearly lots of issues with this at this stage, but it is raised as a PR to track the feedback given to Claude.

Integration tests run against the Ably sandbox environment:
- `POST https://sandbox.realtime.ably-nonprod.net/apps` to provision app
- Use `endpoint: "sandbox"` in ClientOptions
- Test real server behavior and validation
paddybyers (Member Author):

Add:
Test apps created using this endpoint should be created once in the setup for a test run, and explicitly cleared when complete. Multiple tests can run against a single app so long as there is no conflict between the state created by those tests. Therefore, any channels created by tests within sandbox apps should be unique to each test. The preferred approach to ensuring uniqueness is to construct channel names as a combination of a descriptive part that refers to the test (eg including the name of the test, or the ID of the spec item) plus a random part that's sufficiently large to ensure the risk of collision is negligible (eg a base64-encoded 48-bit number).
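For example, a shared helper along these lines (a sketch only; the helper name and exact format are suggestions, not part of the spec):

```pseudo
# Unique channel name = descriptive part (test name or spec item) + random part
FUNCTION unique_channel_name(test_id):
    # 48 bits of randomness, base64-encoded, makes the risk of collision negligible
    RETURN test_id + "-" + base64_encode(random_bytes(6))

# e.g. in a history test
channel = client.channels.get(unique_channel_name("RSL2b-history-limit"))
```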

api_key = app_config.keys[0].key_str

AFTER ALL TESTS:
# Sandbox apps auto-delete after 60 minutes
paddybyers (Member Author):

The sandbox app should be explicitly deleted
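For example (a sketch; it assumes the sandbox exposes a keyed DELETE on the provisioned app resource, which should be confirmed):

```pseudo
AFTER ALL TESTS:
    # Explicitly delete the sandbox app rather than relying on auto-expiry
    AWAIT http_delete(
        url: "https://sandbox.realtime.ably-nonprod.net/apps/" + app_config.app_id,
        auth: api_key
    )
```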


### Test Steps
```pseudo
result = AWAIT client.time()
paddybyers (Member Author):

Do not use time() because this does not require authentication. Use the channel status endpoint instead.


### Assertions
```pseudo
ASSERT result IS valid timestamp
paddybyers (Member Author):

This needs to be updated since the test will no longer use time(). However, since this test aims to check that authentication is working, it is not necessary to check that the response body satisfies any particular condition; just check that the request succeeded and wasn't rejected with an authentication or authorization failure
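A sketch of the revised steps and assertions (the channel status method name will vary by library):

```pseudo
# Revised: use channel status, which requires authentication, instead of time()
channel = client.channels.get(unique_channel_name("RSA4-auth-smoke"))
AWAIT channel.status()

# It is enough that the request succeeded; assert only that no
# authentication/authorization error was raised, nothing about the response body
ASSERT no AblyException was raised
```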

@@ -0,0 +1,324 @@
# Auth Integration Tests

Spec points: `RSA4`, `RSA8`, `RSA9`, `RSA10`, `RSA14`, `RSA15`
paddybyers (Member Author):

All relevant functionality should be integration-tested with JWTs as well as with Ably native tokens (obtained using requestToken()). JWT should be the primary token format used. Native tokens, and the correct handling of token requests etc, should be tested in a way that's as independent as possible from testing the mechanisms relating to handling tokens in requests and the token renewal process via authCallback and authURL
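For example, the token-format coverage could be parameterised along these lines (a sketch; the JWT helper is assumed to exist in the test harness):

```pseudo
FOR token_format IN ["jwt", "ably_token"]:
    auth_callback = FUNCTION(token_params):
        IF token_format == "jwt":
            RETURN create_jwt(api_key, token_params)    # harness helper, signs a JWT with the sandbox key
        ELSE:
            RETURN AWAIT key_client.auth.requestToken(token_params)

    client = Rest(options: ClientOptions(
        authCallback: auth_callback,
        endpoint: "sandbox"
    ))
    # ... run the shared auth assertions with this client
```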

))

# Verify token works
result = AWAIT token_client.time()
paddybyers (Member Author):

As above, don't use time() - use channel status

ASSERT token_details.token IS String
ASSERT token_details.token.length > 0
ASSERT token_details.expires > now()
ASSERT result IS valid timestamp
paddybyers (Member Author):

Don't verify the response from the channel status API - just verify that the request succeeded


---

## RSA9 - createTokenRequest creates signed request
paddybyers (Member Author):

This does not need to be an integration test; createTokenRequest() is just a local operation that signs a token request with the API key

# Create token request
token_request = AWAIT client.auth.createTokenRequest()

# Exchange it for a token (simulating another client)
paddybyers (Member Author):

This is not a valid use of requestToken(); it does not take a TokenRequest as an argument. Check the spec (RSA8).
Instead, create an authCallback which uses a client to generate and return a TokenRequest
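A sketch of the corrected shape (names illustrative):

```pseudo
# authCallback returns a TokenRequest created and signed by a key-based client
auth_callback = FUNCTION(token_params):
    RETURN AWAIT key_client.auth.createTokenRequest(token_params)

token_client = Rest(options: ClientOptions(
    authCallback: auth_callback,
    endpoint: "sandbox"
))

# Any authenticated request now causes the library to exchange the TokenRequest for a token
AWAIT token_client.channels.get(unique_channel_name("RSA8-token-request")).status()
```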


---

## RSA10 - authorize() obtains and uses token
paddybyers (Member Author):

I don't think an integration test helps here. I would have a test that mocks the HTTP client; this would enable a much more explicit confirmation that a new token is used on subsequent requests.
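Something like this (a sketch; the mock interface is illustrative):

```pseudo
mock_http = MockHttpClient()
client = Rest(options: ClientOptions(key: api_key, http: mock_http))

# authorize() obtains a token
mock_http.next_response = token_response(token: "token-1")
AWAIT client.auth.authorize()

# authorize() again forces a new token
mock_http.next_response = token_response(token: "token-2")
AWAIT client.auth.authorize()

# A subsequent request must carry the new token
mock_http.next_response = success_response()
AWAIT client.channels.get("any-channel").status()

last_request = mock_http.captured_requests[-1]
# (or the encoded form of the token, depending on how the header is built)
ASSERT last_request.headers["Authorization"] CONTAINS "token-2"
```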

### Test Steps
```pseudo
# First request obtains token
time1 = AWAIT client.time()
paddybyers (Member Author):

Do not use time()

client = Rest(options: ClientOptions(
key: api_key,
endpoint: "sandbox",
defaultTokenParams: TokenParams(ttl: 5000) # 5 second TTL
paddybyers (Member Author):

This could be shorter, eg 2s

first_token = client.auth.tokenDetails.token

# Wait for token to expire
WAIT 6 seconds
paddybyers (Member Author):

We have to have a way to do this kind of test without (i) making it take too much time, or (ii) it being flaky because we haven't waited for long enough to be certain that the token expired.

I suggest a test that waits a fixed time for the token to expire (2s, the same as the TTL) and then polls an endpoint (say at 500ms intervals) for a period of time (say 5s) until it gets a rejection. This means that we won't get flakes as a result of minor clock skew, and we won't have to wait longer than necessary for the test to complete
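Sketched as pseudo steps, with the suggested values:

```pseudo
# TTL is 2s: wait the TTL, then poll (500ms interval, 5s bound) until the expired token is rejected
WAIT 2 seconds

rejected = false
deadline = now() + 5 seconds
WHILE now() < deadline AND NOT rejected:
    TRY:
        AWAIT channel.status()       # any request requiring authentication
        WAIT 500 milliseconds
    CATCH AblyException as e:
        # expect a 401 with a token-expired error code (confirm the exact code against the spec)
        ASSERT e.statusCode == 401
        rejected = true

ASSERT rejected == true
```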


---

## RSA15 - Expired token rejected
paddybyers (Member Author):

RSA15 is not about token expiry. I do not understand why this test has been created.


---

## RSA - clientId in token
paddybyers (Member Author):

RSA what?

api_key = app_config.keys[0].key_str

AFTER ALL TESTS:
# Sandbox apps auto-delete after 60 minutes
paddybyers (Member Author):

See comment elsewhere. Sandbox apps should be explicitly deleted.

AWAIT unique_channel.publish(name: "event3", data: { "key": "value" })

# Wait for persistence
WAIT 1 second
paddybyers (Member Author):

See comments elsewhere about fixed waits. Consider polling to minimise the expected test duration without introducing a flaky test
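For example, replace the fixed wait with a bounded poll (a sketch):

```pseudo
# Poll history until the published messages are visible, rather than a fixed WAIT
expected_count = 3                    # number of messages published by this test
deadline = now() + 5 seconds
found = false
WHILE now() < deadline AND NOT found:
    history = AWAIT unique_channel.history()
    IF history.items.length >= expected_count:
        found = true
    ELSE:
        WAIT 200 milliseconds

ASSERT found == true
```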

```pseudo
# Publish with small delays to ensure ordering
AWAIT unique_channel.publish(name: "first", data: "1")
WAIT 100 milliseconds
paddybyers (Member Author):

Why wait here?

FOR i IN 1..15:
AWAIT unique_channel.publish(name: "event-" + str(i), data: str(i))

WAIT 2 seconds
paddybyers (Member Author):

For all tests in this file, see comments elsewhere about the use of WAIT

```pseudo
BEFORE ALL TESTS:
app_config = provision_sandbox_app()
api_key = app_config.keys[0].key_str
paddybyers (Member Author):

The tests in this file require a key with restricted capabilities. The app that is created for any given test run must include all fixtures needed for all tests. This means that it should contain multiple keys - at least one key with rights to all channels, and one key with rights to a subset of channels
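For example, the provisioning request could create the app with two keys (a sketch; the exact app-spec format accepted by the provisioning endpoint should be confirmed):

```pseudo
BEFORE ALL TESTS:
    app_config = provision_sandbox_app(app_spec: {
        "keys": [
            { "capability": { "*": ["*"] } },                    # full rights to all channels
            { "capability": { "allowed:*": ["subscribe"] } }     # restricted to a subset of channels
        ]
    })
    full_key       = app_config.keys[0].key_str
    restricted_key = app_config.keys[1].key_str
```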

api_key = app_config.keys[0].key_str

AFTER ALL TESTS:
# Sandbox apps auto-delete after 60 minutes
paddybyers (Member Author):

Explicitly delete the sandbox app

### Setup
```pseudo
client = Rest(options: ClientOptions(
key: api_key,
paddybyers (Member Author):

This should be a key with restricted rights


---

## RSL1k4 - Idempotent publish with library-generated IDs (retry verification)
paddybyers (Member Author):

I don't think this should be an integration test - we don't yet have a proxy that simulates certain failure events. Instead have a test with a mocked HTTP client that fails the first attempt
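A sketch of that shape (mock interface illustrative):

```pseudo
mock_http = MockHttpClient()
client = Rest(options: ClientOptions(key: api_key, http: mock_http))

# Fail the first publish attempt with a retriable error; the retry succeeds
mock_http.responses = [
    error_response(statusCode: 500),
    success_response()
]

AWAIT client.channels.get("idempotent-test").publish(name: "event", data: "payload")

# Both attempts should carry the same library-generated message id
first_attempt = mock_http.captured_requests[0].body.messages[0]
retry_attempt = mock_http.captured_requests[1].body.messages[0]
ASSERT first_attempt.id IS NOT NULL
ASSERT first_attempt.id == retry_attempt.id
```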


## RSL1k5 - Idempotent publish with client-supplied IDs (explicit duplicate)

Tests that multiple publishes with the same client-supplied ID result in single message.
paddybyers (Member Author):

As above, this shouldn't be an integration test

FAIL("Expected exception not thrown")
CATCH AblyException as e:
ASSERT e.statusCode == 400 OR e.statusCode == 401
ASSERT e.message CONTAINS "clientId" OR e.message CONTAINS "mismatch"
paddybyers (Member Author):

This should check for a specific error code


## RSA8d - authCallback invocation

Tests that `authCallback` is invoked with `TokenParams` and returns token.
paddybyers (Member Author):

This description says that the authCallback responds with a token, whereas the code actually returns a TokenDetails.


---

## RSA8d - authCallback returns different token types
paddybyers (Member Author):

This test appears to be a superset of the test above. We should probably just delete the test above.
This test should have a case where the token supplied by the authCallback is a JWT

ASSERT token_details.token == "obtained-token"

# Verify token is now used for requests
AWAIT client.time()
paddybyers (Member Author):

Do not use time() - it does not require authentication

tokenParams: TokenParams(clientId: "saved-client", ttl: 3600000)
)

# Wait for token to expire
paddybyers (Member Author):

This test uses a mocked HTTP client so there is no need to actually wait for token expiry.

The first invocation should receive an error response from the mock HTTP client that has a code of 40171. This response indicates token expiry to the client and should trigger a token renewal cycle by the client.
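Sketched (mock interface illustrative):

```pseudo
# First request rejected with the token-expired code; the client should renew via the
# authCallback and retry the request with the new token
mock_http.responses = [
    error_response(statusCode: 401, code: 40171),
    success_response()
]

AWAIT client.channels.get("renewal-test").status()

ASSERT auth_callback_call_count == 2        # initial token + renewal
retried_request = mock_http.captured_requests[-1]
ASSERT retried_request.headers["Authorization"] CONTAINS renewed_token
```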


### Assertions
```pseudo
ASSERT original_callback_called == false
paddybyers (Member Author):

How do these assertions confirm that the authOptions replace the defaults?

@@ -0,0 +1,312 @@
# Client ID Tests

Spec points: `RSA7`, `RSA7a`, `RSA7b`, `RSA7c`, `RSA12`, `RSA12a`, `RSA12b`
paddybyers (Member Author):

No tests below reference RSA7b. Where is it tested?

AWAIT client.time() # Or any operation requiring auth
FAIL("Expected exception due to clientId mismatch")
CATCH AblyException as e:
ASSERT e.message CONTAINS "clientId" OR e.message CONTAINS "mismatch"
paddybyers (Member Author):

This should check for a specific error code in the response. I think this would be 40012

```

### Note
The exact timing of mismatch detection (constructor vs first use) may vary by implementation. The key requirement is that the mismatch is detected and reported as an error.
paddybyers (Member Author):

How is this ambiguity represented in the tests?

If the tests accommodate this ambiguity by, for example, performing both operations and expecting an error at some point, then this should be noted in a comment in the tests where this appears.
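For example (a sketch):

```pseudo
# NOTE: libraries may reject the clientId mismatch either at construction time or on
# first use; this test accepts an error at either point
TRY:
    client = Rest(options: ClientOptions(token: token_for_other_client, clientId: "mismatched"))
    AWAIT client.channels.get(unique_channel_name("RSA7-mismatch")).status()
    FAIL("Expected clientId mismatch to be rejected")
CATCH AblyException as e:
    ASSERT e.code == 40012    # specific code suggested in the comment above
```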

```pseudo
request = mock_http.captured_requests[0]

# Either limit param is absent (server default) or explicitly "100"
paddybyers (Member Author):

The client should not append a limit param if the caller does not specify it. I don't think this test should exist


## RSL1k2 - Serial increments for batch publish

Tests that serial numbers increment for each message in a batch.
paddybyers (Member Author):

I don't think this description aligns with the definition of this spec point


## RSL1k - Client-supplied ID preserved

Tests that client-supplied message IDs are not overwritten.
paddybyers (Member Author):

I don't see anything in the spec for this spec point that refers to preservation of client-supplied IDs. I don't think this test should exist.


---

## TI - AblyException wraps ErrorInfo
paddybyers (Member Author):

I don't see any reference to AblyException in the spec. Where did this test come from?


## TI - Common error codes

Tests that common Ably error codes are handled correctly.
paddybyers (Member Author):

This test is meaningless. I don't think this exercises any functionality


---

## TI - Error string representation
paddybyers (Member Author):

I don't know why this test is here. There is no requirement that error messages contain the error code


---

## TI - Error equality
paddybyers (Member Author):

I don't know what in the specification requires that this test exist


## TM5 - Message equality

Tests that messages can be compared for equality.
paddybyers (Member Author):

I don't know why this test exists. I don't think it is relevant to TM5.


---

## TO3 - ClientOptions attributes
paddybyers (Member Author):

Not all libraries will have ClientOptions as a public type in the form expected by these tests, because that might not be an idiomatic way to pass arguments to the Ably constructor. For example, the library might expose a builder pattern. These tests should be expressed in a way that's compatible with that.

@@ -0,0 +1,304 @@
# PaginatedResult Types Tests
paddybyers (Member Author):

I think these tests substantially duplicate what is tested in the PaginatedResult tests. I don't think we need them.


---

## TG - Link header parsing
paddybyers (Member Author):

Keep these
