Overview
IMDB doesn't provide their API publicly and if you really need to use their APIs, you'd need to pay thousands of freedom currency units per year. That's way too much. I certainly don't make that much money to pay for this. But I also kind of want to use their API. Looks like we need to figure out how they use their own API first.
To do this, we need to install mitmproxy. It's a fantastic tool for observing network traffic. Just make sure to install its CA certificate in the system store.
Fortunately, IMDB doesn't pin certificates, which means installing mitmproxy's certificate in the system store is enough to decrypt the connection from the IMDB app. The certificate has to be in the system store because, since Android 7, apps ignore the user certificate store by default. When you install a certificate through the usual means on Android, it goes into the user certificate store.
Follow mitmproxy's Getting Started guide to set it up. For me, it was as simple as apt install mitmproxy and running mitmproxy or mitmweb (for the web interface). Then set up a proxy on Android in the Wi-Fi settings (it's under "Advanced") and run the official IMDB app.
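To keep the capture focused on IMDB traffic, a small mitmproxy addon can print only the requests of interest. This is a minimal sketch, not part of the original setup: the file name imdb_log.py, the helper summarize, and the choice of headers to print (taken from later in this post) are all assumptions. It can be loaded with mitmdump -s imdb_log.py.

```python
# imdb_log.py (hypothetical file name); load with: mitmdump -s imdb_log.py
# Headers worth watching; the names come from the requests discussed later.
INTERESTING_HEADERS = (
    "x-amz-date",
    "x-amz-security-token",
    "x-amzn-authorization",
    "x-amzn-sessionid",
)


def summarize(flow):
    """Return a printable summary for IMDB-related flows, or None otherwise."""
    request = flow.request
    if "imdb" not in request.pretty_host:
        return None
    lines = [f"{request.method} https://{request.pretty_host}{request.path}"]
    for name in INTERESTING_HEADERS:
        value = request.headers.get(name)
        if value is not None:
            lines.append(f"  {name}: {value}")
    return "\n".join(lines)


def request(flow):
    """mitmproxy calls this hook for every client request."""
    summary = summarize(flow)
    if summary:
        print(summary)
```

The substring check on the host is deliberately loose so that both api.imdbws.com and v3.sg.media-imdb.com match.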
Public API Endpoints
Before we get any further: IMDB has public, undocumented APIs that don't need authentication. The only interesting one I found is the search suggestions API.
curl -X GET "https://v3.sg.media-imdb.com/suggestion/a/<slug>.json"
with <slug> being the URL-encoded title of what you want to find, or the IMDB id. The response is self-explanatory. Most interesting is the qid property, which helps with filtering out extra fluff. Here's what I got searching for rick and morty.
{
  "d": [
    {
      "i": {
        "height": 1920,
        "imageUrl": "https://m.media-amazon.com/images/M/MV5BZjRjOTFkOTktZWUzMi00YzMyLThkMmYtMjEwNmQyNzliYTNmXkEyXkFqcGdeQXVyNzQ1ODk3MTQ@._V1_.jpg",
        "width": 1280
      },
      "id": "tt2861424",
      "l": "Rick and Morty",
      "q": "TV series",
      "qid": "tvSeries",
      "rank": 78,
      "s": "Justin Roiland, Chris Parnell",
      "y": 2013
    },
    ...
  ],
  "q": "rick%20and%20morty",
  "v": 1
}
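The same lookup can be sketched in Python. The function names here are made up for illustration, and the a path segment is copied verbatim from the curl command above; I haven't checked whether it should vary with the query. Filtering on qid keeps only the kind of result you care about.

```python
import json
import urllib.parse
import urllib.request

# URL shape copied from the curl example; the "a" segment is kept as-is.
SUGGESTION_URL = "https://v3.sg.media-imdb.com/suggestion/a/{slug}.json"


def suggestion_url(query: str) -> str:
    """Build the suggestion URL for a title or an IMDB id."""
    return SUGGESTION_URL.format(slug=urllib.parse.quote(query))


def filter_by_qid(response: dict, qid: str) -> list:
    """Keep only the suggestions whose qid matches, e.g. "tvSeries"."""
    return [item for item in response.get("d", []) if item.get("qid") == qid]


def search(query: str, qid: str = "tvSeries") -> list:
    """Fetch suggestions over the network and filter them by qid."""
    with urllib.request.urlopen(suggestion_url(query)) as resp:
        return filter_by_qid(json.load(resp), qid)
```

Calling search("rick and morty") should return entries shaped like the one shown above, assuming the endpoint still behaves the same way.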
Authenticated API Endpoints
Pretty much everything else needs authentication.
Get temporary credentials
Getting temporary credentials is the easy part. All you need is the appKey and curl. I suspect the appKey is specific to the IMDB Android app.
curl -X POST -d '{"appKey": "c2a5f61b-8dea-44bc-b739-db7937519f4e"}' https://api.imdbws.com/authentication/credentials/temporary/android860
What you get back is everything you need to authenticate against "AWS Data Exchange" database or whatever it is. Apparently it's the "new" api.
{
  "@meta": {
    "operation": "GetTemporaryCredentials",
    "requestId": "xyz-123-xyz-123",
    "serviceTimeMs": "1.23"
  },
  "resource": {
    "@type": "imdb.api.auth.credentials.temporary",
    "accessKeyId": "<alphanumeric>",
    "expirationTimeStamp": "1999-12-30T03:00:00Z",
    "secretAccessKey": "<alphanumeric>",
    "sessionToken": "<base64 encoded string>"
  }
}
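Since the credentials are temporary, it helps to wrap the request and the expiry check in a couple of helpers. This is a rough sketch: the function names are hypothetical, and the timestamp format is inferred only from the sample response above.

```python
import json
import urllib.request
from datetime import datetime, timezone

# URL and appKey copied from the curl command above.
CREDENTIALS_URL = "https://api.imdbws.com/authentication/credentials/temporary/android860"
APP_KEY = "c2a5f61b-8dea-44bc-b739-db7937519f4e"


def build_credentials_request() -> urllib.request.Request:
    """Prepare (but do not send) the POST that asks for temporary credentials."""
    body = json.dumps({"appKey": APP_KEY}).encode("utf-8")
    return urllib.request.Request(CREDENTIALS_URL, data=body, method="POST")


def parse_credentials(raw: str) -> dict:
    """Return the "resource" object from the response body."""
    return json.loads(raw)["resource"]


def is_expired(credentials: dict, now: datetime) -> bool:
    """Compare expirationTimeStamp (assumed to be UTC, e.g. 1999-12-30T03:00:00Z) with now."""
    expires = datetime.strptime(
        credentials["expirationTimeStamp"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return now >= expires
```

Passing the prepared request to urllib.request.urlopen and the decoded body to parse_credentials gives a dict with accessKeyId, secretAccessKey, and sessionToken.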
Everything under "resource" will be handy when doing authentication. The sessionToken will be the value of x-amz-security-token, and accessKeyId will be part of the x-amzn-authorization header of future requests.
Authenticate
This is the difficult part. There are certain headers that are required for authorization against AWS Data Exchange. Amazon has documentation for the expected values of x-amz-date, x-amz-security-token, and x-amzn-authorization.
x-amzn-sessionid: 942-1698069-8532063
x-amz-date: <ISO-8601 date format>
x-amz-security-token: <ALPHANUMERIC>
x-amzn-authorization: <specific format>
In addition, the following headers are informational.
user-agent: IMDb/8.7.0.108700400 (Fairphone|FP3; Android 29; Fairphone) IMDb-flg/8.7.0 (1080,2016,422,428) IMDb-var/app-andr-ph
accept: application/vnd.imdb.api+json
x-amz-date
This is the ISO-8601 formatted date.
headers["x-amz-date"] = datetime.datetime.today().isoformat()
x-amz-security-token
Another given one. This is the sessionToken from the credentials we got earlier.
headers["x-amz-security-token"] = credentials["sessionToken"]
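The easy headers can be collected in one place. One caveat: the string to sign captured near the end of this post shows x-amz-date as Thu, 15 Sep 2022 03:39:07 GMT, which is the HTTP-date form rather than ISO-8601. I haven't verified which form the server insists on, so this sketch (with made-up helper names) can produce both and defaults to the captured one.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def amz_date(moment: datetime, http_style: bool = True) -> str:
    """Format a timestamp for x-amz-date.

    http_style=True gives the form seen in the captured string to sign;
    http_style=False gives ISO-8601. Which one is required is an assumption.
    """
    moment = moment.astimezone(timezone.utc)
    if http_style:
        return format_datetime(moment, usegmt=True)
    return moment.isoformat()


def base_headers(credentials: dict, session_id: str, moment: datetime) -> dict:
    """Headers that need no signing logic; x-amzn-authorization comes later."""
    return {
        "x-amzn-sessionid": session_id,
        "x-amz-date": amz_date(moment),
        "x-amz-security-token": credentials["sessionToken"],
    }
```

Whatever date string goes into the header has to be the same string that goes into the signature, so it is worth computing it once and reusing it.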
x-amzn-authorization
At last, the beast! It took a bit of searching to find the documentation page, and it's not obvious how SWF relates to Data Exchange. The clue is that the header value IMDB sends has the same format as the one the docs describe. Let's break it down.
It starts with AWS3 as a tag, followed by AWSAccessKeyId, which we get from accessKeyId in the credentials. The Algorithm is always HmacSHA256. Then there is a Signature and SignedHeaders.
The way this header works is that we construct a string including some information about the request being sent (let's call it string_to_sign), and sign it. That's our Signature. The headers that we included in string_to_sign are then listed in full under SignedHeaders. This extra information from the app's communication helps us figure out what headers we need to include. A sample x-amzn-authorization is as follows.
AWS3 AWSAccessKeyId=ASIAYOLDPPJ6WMOMECUF,Algorithm=HmacSHA256,Signature=1meBNRwYsk+HVziftdJ/8Bpb1F9DG82Ss6dLLzlKHGk=,SignedHeaders=host;x-amz-date;x-amz-security-token;x-amzn-sessionid
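Putting the pieces of the header together is plain string formatting once the signature is known. A minimal sketch, with a hypothetical function name, that reproduces the layout of the sample above:

```python
def build_authorization_header(access_key_id: str, signature: str, signed_headers: list) -> str:
    """Assemble an x-amzn-authorization value in the layout of the sample header."""
    # Header names are lowercased and sorted, matching the order in the sample.
    names = ";".join(sorted(name.lower() for name in signed_headers))
    return (
        f"AWS3 AWSAccessKeyId={access_key_id},Algorithm=HmacSHA256,"
        f"Signature={signature},SignedHeaders={names}"
    )
```

Feeding it the access key, signature, and header names from the sample gives back the same string, which is a handy sanity check before worrying about how the signature is computed.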
To recreate this, we have the x-amz-date, x-amz-security-token, and even x-amzn-sessionid, which we can copy from the app. But what is the host?
Down the rabbit hole
The host is not evident from the requests that are being sent. This is where we need to go to the source. The next step, then, is to get apktool, dex2jar, and jd-gui and disassemble the IMDB apk.
In jd-gui, a search for X-Amzn-Auth (note the capital letters) reveals the RedactedHeaders class, aptly named.
arrayList.add("x-amz-security-token");
arrayList.add("X-Amzn-Authorization");
arrayList.add("x-imdb-authentication");
arrayList.add("x-imdb-map-authentication");
arrayList.add("x-imdb-map-authentication-token");
The most interesting function is public String sign(). The argument names are exposed by the calls to kotlin.jvm.internal.Intrinsics.checkNotNullParameter. I've transcribed the code into Python for no good reason other than my own understanding.
public String getStringToSign() {
    headers.put("host", hostname);
    join
}
import hashlib
import urllib.parse
from typing import Dict, List

def sign(hostname: str,
         method: str,
         path: str,
         headers: Dict[str, str],
         params: Dict[str, str],
         array_of_bytes: List[int],
         credentials: Dict[str, str]):  # ZuluTemporaryCredentials in the app
    # getStringToSign(hostname, method, path, headers, params)
    headers["host"] = hostname
    stringToSign = "".join([
        method,
        "/" + urllib.parse.quote(path),  # ZuluSigningHelper.getCanonicalizedResource
        urllib.parse.urlencode(sorted(params.items())),  # ZuluSigningHelper.getCanonicalizedQueryString
        "\n".join(["%s:%s" % (k, headers[k]) for k in sorted(headers.keys())]),  # ZuluSigningHelper.canonicalHeaders
    ])[:30]  # truncation carried over as transcribed

    # ZuluSigningHelper.hash(stringToSign, array_of_bytes)
    digest = hashlib.sha256()
    digest.update(stringToSign.encode("UTF-8"))
    digest.update(bytes(array_of_bytes))
    hashedStringToSignWithBody = digest.hexdigest()

    # calculateSignature stands in for ZuluSignatureCalculator.calculateSignature (not transcribed)
    signature = calculateSignature(hashedStringToSignWithBody, credentials["secretAccessKey"])

    # ZuluSigner.getAuthorizationHeader(headers, signature, credentials)
    authorization_header = f"AWS3 AWSAccessKeyId={credentials['accessKeyId']},Algorithm=HmacSHA256,Signature={signature},SignedHeaders={ZuluSigningHelper.canonicalHeaderKeys(headers)}"[:62]  # truncation carried over as transcribed
    return authorization_header

def getStringToSign(hostname, method, path, headers, params):
    pass  # blah blah blah
Looks like there is more that we are missing. Next is ZuluSigningInterceptor, which looks very interesting. A search for ZuluSign reveals a world of wonder: ZuluSigner, which includes the methods getAuthorizationHeader and getStringToSign.
getAuthorizationHeader starts with "AWS3 AWSAccessKeyId" followed by getAccessKeyId, which is accessKeyId. Then Algorithm=HmacSHA256. Signature is calculated by ZuluSignatureCalculator.calculateSignature, which is passed to ZuluSigner from somewhere, and SignedHeaders is taken from ZuluSigningHelper.canonicalHeaderKeys. Importantly, in the real requests, SignedHeaders includes only host;x-amz-security-token;x-amzn-sessionid.
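The body of calculateSignature isn't quoted in this post, so the following is only a guess at its shape. It assumes the usual AWS3 convention: take the SHA-256 digest of the string to sign plus the request body, HMAC-SHA256 that digest with secretAccessKey, and base64-encode the result. That would at least match the sample signature, which is 44 base64 characters, the encoded length of a 32-byte MAC. The function name and the choice of raw digest bytes (rather than a hex string) are assumptions.

```python
import base64
import hashlib
import hmac


def calculate_signature(string_to_sign: str, body: bytes, secret_access_key: str) -> str:
    """Guess at the signing step: base64(HMAC-SHA256(key, SHA256(string + body)))."""
    # Hash the canonical string together with the (possibly empty) request body.
    digest = hashlib.sha256(string_to_sign.encode("utf-8") + body).digest()
    # Key the MAC with the temporary secretAccessKey.
    mac = hmac.new(secret_access_key.encode("utf-8"), digest, hashlib.sha256).digest()
    return base64.b64encode(mac).decode("ascii")
```

If the server rejects signatures produced this way, the first things to compare against the decompiled code are whether the hash is passed on as bytes or as hex, and how the key is encoded.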
At this point, I could guess the value of host and the canonical resource path used to make the signature from the requests that are being made. I'd guess host is api.imdbws.com and the resource path is the URL path the request is being made to, e.g. /template/imdb-android-writable/8.7.title-persisted-metadata.jstl/render. I would also have to play around with parameters that may be passed, and I have no idea where to even look. That's too much guesswork.
Frida and instrumentation
Why guess when you can observe? I had never used the Frida instrumentation tools before, so it was a fun exercise. Install Frida on your phone, then install frida-tools on your computer. Follow the official instructions to set it up. My phone is a rooted LineageOS phone. I had to run setenforce 0 as root in termux to allow frida-server to run.
Once Frida is set up, we can begin experimenting. frida-ps -U shows a list of running apps, but it shows the IMDB app as IMDb. To get the package name, run frida-ps -U -a -i (see frida-ps -h for help). This helpfully returns com.imdb.mobile. Then, to run any Frida script against the IMDB app, run:
frida -U -f com.imdb.mobile -l myscript.js
myscript.js is the Frida hook we'll write to monitor calls to our target functions. After reading through the JavaScript API reference and a bit of searching around, I eventually got to this script:
Java.perform(function() {
  var calculatorActivity = Java.use("com.imdb.webservice.requests.zulu.ZuluSigner");
  calculatorActivity.getStringToSign.implementation = function(a, b, c, m, l) {
    var retval = this.getStringToSign(a, b, c, m, l);
    console.log("---BEGIN---");
    console.log(retval);
    console.log("--- END ---");
    return retval;
  };
})
Read through the documentation for details, but essentially this script replaces the implementation of the getStringToSign function with our function here, which calls the original, prints the return value, and returns it so the app can continue functioning. This Frida thing is magic!
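The same hook can also be driven from Python through Frida's Python bindings, which is convenient if you want to collect the strings rather than read them off the console. A sketch under a few assumptions: the frida package is installed on the computer, the phone is connected over USB, and the helper names make_hook and run are made up. The JavaScript is generated from a template so the class and method names can be swapped.

```python
# JavaScript hook template: forwards whatever the hooked method returns.
HOOK_TEMPLATE = """
Java.perform(function () {
  var target = Java.use("%(class_name)s");
  target.%(method)s.implementation = function () {
    var retval = this.%(method)s.apply(this, arguments);
    send(String(retval));
    return retval;
  };
});
"""


def make_hook(class_name: str, method: str) -> str:
    """Fill in the template for one class and one (non-overloaded) method."""
    return HOOK_TEMPLATE % {"class_name": class_name, "method": method}


def run(package: str, source: str) -> None:
    """Spawn the app, load the hook, and print every payload it sends."""
    import frida  # third-party bindings, imported lazily

    device = frida.get_usb_device()
    pid = device.spawn([package])
    session = device.attach(pid)
    script = session.create_script(source)
    script.on("message", lambda message, data: print(message.get("payload")))
    script.load()
    device.resume(pid)
    input("Press Enter to detach\n")
    session.detach()
```

With those in place, run("com.imdb.mobile", make_hook("com.imdb.webservice.requests.zulu.ZuluSigner", "getStringToSign")) should behave like the frida command line invocation above.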
As a side note, the following snippet is helpful for exploring what classes you have access to while using Frida (to be used with Java.use). I used it to make sure I'm catching the right class with Java.use.
Java.enumerateLoadedClasses({
  onMatch: function(className) {
    if (className.startsWith("com.imdb.webservice")) {
      console.log(className);
    }
  },
  onComplete: function() {}
});
The output is quite helpful and answers the remaining question.
GET
/template/imdb-android-writable/8.7.app-config.jstl/render
host:api.imdbws.com
x-amz-date:Thu, 15 Sep 2022 03:39:07 GMT
x-amz-security-token:somelonghexstringwhichwealreadyknowtheoriginof
x-amzn-sessionid:123-1231233-1231233
- Notes
  - Two empty lines at the end, indicating the requests don't include a body
  - Lack of parameters in the string to sign
  - No spaces around the colon (which separates the header name from its value)
The above info is exactly what we saw from the requests, but now we know. The rest is just implementing the rest of the signing procedure in python, an exercise left to the reader.
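As a starting point for that exercise, here is a minimal sketch that rebuilds the captured string from its parts. The function name is made up, and the exact placement of blank lines (an empty query-string line after the path and an empty line after the headers) is an assumption based on the AWS3 layout and the notes above, so compare the output byte for byte with what Frida prints before trusting it.

```python
def build_string_to_sign(method: str, path: str, headers: dict, query: str = "") -> str:
    """Rebuild the canonical string: method, path, query, sorted headers, blank line."""
    # Lowercase the header names, sort them, and join as name:value with no spaces.
    pairs = sorted((name.lower(), value) for name, value in headers.items())
    canonical_headers = "".join(f"{name}:{value}\n" for name, value in pairs)
    # The trailing newline leaves room for the body, which is empty for GET requests.
    return f"{method}\n{path}\n{query}\n{canonical_headers}\n"


example = build_string_to_sign(
    "GET",
    "/template/imdb-android-writable/8.7.app-config.jstl/render",
    {
        "host": "api.imdbws.com",
        "x-amz-date": "Thu, 15 Sep 2022 03:39:07 GMT",
        "x-amz-security-token": "somelonghexstringwhichwealreadyknowtheoriginof",
        "x-amzn-sessionid": "123-1231233-1231233",
    },
)
print(example)
```

The resulting string is what gets hashed and signed, and the sorted header names are what end up in SignedHeaders.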