I'm either hitting a strange bug or making a mistake somewhere in a simple SQL query. I'm using pandas and pandasql in an image that will run on AWS Lambda.
Input CSV (`testdata.csv`):

| Org | Usage | Bucket |
|---|---|---|
| Abc | 1GB | test-bucket-1 |
| Def | 2GB | test-bucket-10 |

Code:

    import pandas as pd
    from pandasql import sqldf
    import boto3

    # Load the CSV; the first row is the header
    df = pd.read_csv("testdata.csv", sep=",", header=0)

    # Exact-match filter on the Bucket column
    query = """select * from df where Bucket='test-bucket-1'"""
    res = sqldf(query)
    res.to_csv('pandas-op.csv')
    print(res)

Output:

    0,Abc,1GB,test-bucket-1
    1,Def,2GB,test-bucket-10

I expect only the row `0,Abc,1GB,test-bucket-1` to be returned. Why does the result include the partial match `test-bucket-10`?
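
As a sanity check, here is a minimal sketch that runs the same query directly against an in-memory SQLite database, which is, as far as I understand, what pandasql uses under the hood. The inline rows are copied from the sample CSV and the table is named `df` only to mirror the query; this is an isolation test under those assumptions, not my actual Lambda code.

```python
import sqlite3

# Rows copied from the sample CSV; hard-coded here so the check has no file dependency.
rows = [
    ("Abc", "1GB", "test-bucket-1"),
    ("Def", "2GB", "test-bucket-10"),
]

# In-memory database with a table named df to mirror the pandasql query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df (Org TEXT, Usage TEXT, Bucket TEXT)")
conn.executemany("INSERT INTO df VALUES (?, ?, ?)", rows)

# Same WHERE clause as in the pandasql query.
matches = conn.execute("select * from df where Bucket='test-bucket-1'").fetchall()
print(matches)
conn.close()
```

With plain SQLite equality this returns only the `Abc` row, which is why the pandasql result surprises me.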