When I went through my first learning steps with Kinesis, Firehose, and Redshift today, I was pleased to discover that Amazon had a "try our demo data producer" setup.
I was frustrated to learn that it does not seem to actually work.
So I went digging, and found that STL_LOAD_ERRORS contained errors suggesting that a delimiter was expected, with the start of each record looking like {field:val,field:val}{field:val,field:val}.
...{"TICKER_SYMBOL": | 1214 | Delimiter not found
"Must be stripping newlines somewhere," I thought.
After digging, I found that there are production records in the relevant S3 bucket, in a surprising format:
{field:val,field:val}{field:val,field:val}...
That is, there are no delimiters between the apparent records; each file in the bucket is a single line of several dozen KB.
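To make the format concrete, here is a minimal Python sketch of how such a file could be split back into individual records and rewritten with one record per line. It relies only on the standard library's json.JSONDecoder.raw_decode. The helper name split_concatenated_json, the PRICE field, and the sample values are made up for illustration; only TICKER_SYMBOL comes from the error output above.

```python
import json


def split_concatenated_json(blob: str) -> list:
    """Split back-to-back JSON objects ({...}{...}) into Python objects."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(blob):
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(blob) and blob[pos].isspace():
            pos += 1
        if pos >= len(blob):
            break
        # Returns the decoded object and the index just past it.
        obj, pos = decoder.raw_decode(blob, pos)
        records.append(obj)
    return records


# Hypothetical sample mimicking the undelimited layout found in S3.
sample = '{"TICKER_SYMBOL": "ABC", "PRICE": 1.5}{"TICKER_SYMBOL": "XYZ", "PRICE": 2.0}'
records = split_concatenated_json(sample)

# Re-serialize with one JSON object per line.
ndjson = "\n".join(json.dumps(r) for r in records)
print(ndjson)
```

This is only a way to inspect or repair files after the fact, not a statement about how the pipeline itself should be configured.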
Other SO posts seem to suggest that this is actually the expected data format.
Why does Redshift need data in a format the demo data producer doesn't use? Which one do I reconfigure?
Okay. There were three problems.