Probability and statistics

https://d2l.ai/chapter_preliminaries/probability.html

In the third calculation in section 2.6.3, Pr(D1=1 and D2=1) = 0.0001⋅0.9985 + 0.98⋅0.0015 = 0.00176955, the first factor should be 0.0003, not 0.0001: Pr(D1=1 and D2=1 | H=0) is 0.0003, as calculated in the first equation. The final result is right, though :)
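
The arithmetic can be checked in a few lines. This is a minimal sketch that uses only the numbers quoted in this post; the variable names are made up for illustration.

```python
# Numbers quoted in the post; variable names are illustrative only.
p_h1 = 0.0015              # P(H=1)
p_h0 = 1 - p_h1            # P(H=0) = 0.9985
p_both_given_h1 = 0.98     # P(D1=1, D2=1 | H=1)
p_both_given_h0 = 0.0003   # P(D1=1, D2=1 | H=0), from the first equation

# Law of total probability with the corrected factor 0.0003.
p_both = p_both_given_h0 * p_h0 + p_both_given_h1 * p_h1

# Same sum with the misprinted factor 0.0001, for comparison.
p_both_misprint = 0.0001 * p_h0 + p_both_given_h1 * p_h1

print(p_both, p_both_misprint)
```

Only the version with 0.0003 reproduces the printed result 0.00176955, which is consistent with the factor being a typo rather than an error in the calculation.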

Hi, I would like to understand the output of print(nd.random.multinomial(probabilities, shape=(10))), which is [3 5 2 3 3 2 2 1 5 0].

This is 10 rolls of a die. Is the output saying that 1 was rolled 3 times, 2 was rolled 5 times, and so on up to 6? Or is it the outcome of each individual roll (but then why does it include a 0)?
Thank you

@saruvora,

These are 10 rolls of a die, and the array holds the actual outcomes, one per roll. In this example the faces are numbered 0 to 5 rather than 1 to 6 as on a physical die.

Effectively in your example, you rolled:
first a 3, then a 5, then a 2, then a 3, then a 3, then a 2, then a 2, then a 1, then a 5, then a 0.
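
To make the difference between "one outcome per roll" and "a tally per face" concrete, here is a small sketch. It uses plain NumPy rather than the nd API from the question, so the function names differ from the book's code; that substitution is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
fair_probs = [1.0 / 6] * 6

# One outcome per roll: each entry is a face index in 0..5,
# which is the kind of array shown in the question.
rolls = rng.choice(6, size=10, p=fair_probs)

# A tally instead: entry k says how many times face k came up.
counts = np.bincount(rolls, minlength=6)

print(rolls)   # 10 entries, one per roll
print(counts)  # 6 entries that sum to 10
```

The first array always has as many entries as rolls, while the second always has six entries, so the length of the output is a quick way to tell which of the two you are looking at.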

Hi, I would like to clarify where the third line in the following snippet from section “4.3 Normal distribution” comes from:

# Generate 10 random sequences of 10,000 uniformly distributed random variables
tmp = np.random.uniform(size=(10000,10))
x = 1.0 * (tmp > 0.3) + 1.0 * (tmp > 0.8)
mean = 1 * 0.5 + 2 * 0.2
variance = 1 * 0.5 + 4 * 0.2 - mean**2
print('mean {}, variance {}'.format(mean, variance)) 

In particular, does (tmp > 0.3) mean "take all values above 0.3", and where do 0.3 and 0.8 come from anyway?

Yes, (tmp > 0.3) returns a boolean array with the same shape as tmp that is True wherever tmp[i,j] > 0.3.

Line 3 “transforms” the uniform variables generated in line 2 into the random variable X defined in the text above, with P(X=0) = 0.3, P(X=1) = 0.5, P(X=2) = 0.2. A uniform draw below 0.3 passes neither comparison and gives 0, a draw between 0.3 and 0.8 passes only the first and gives 1, and a draw above 0.8 passes both and gives 2.

This is where 0.3 comes from, and 0.8 is just 0.3 + 0.5 (or, if you’d like to see it another way, 1 - 0.2).

Hope this is not too confusing; it’s really easier to see it than to write it, if you know what I mean.
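
Since it is easier to see than to describe, here is a sketch that reruns the snippet and compares empirical frequencies with the target probabilities. It assumes standard NumPy and a seeded generator (the seed is arbitrary and not part of the book's code).

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Same construction as in the snippet: two thresholds on uniform draws.
tmp = rng.uniform(size=(10000, 10))
x = 1.0 * (tmp > 0.3) + 1.0 * (tmp > 0.8)

# Empirical frequency of each value of X (should be near 0.3, 0.5, 0.2).
values, counts = np.unique(x, return_counts=True)
freqs = counts / x.size

# Theoretical moments: E[X] = 1*0.5 + 2*0.2, Var[X] = E[X^2] - E[X]^2.
mean = 1 * 0.5 + 2 * 0.2
variance = 1 * 0.5 + 4 * 0.2 - mean**2

print(values, freqs)
print(x.mean(), mean)
print(x.var(), variance)
```

The sample mean and variance of x should land close to the theoretical values computed in lines 4 and 5 of the original snippet.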

Just noticed a minor typo in the opening paragraph:

high reward under each of the available action.

“action” should be “actions”.

The book is great so far!

Hi, I ran the code below many times

%matplotlib inline
import d2l
from mxnet import np, npx
import random
npx.set_np()
fair_probs = [1.0 / 6] * 6
p = [1/2, 1/2, 0]
np.random.multinomial(100, fair_probs), np.random.multinomial(100, p)

and always got the result

(array([ 0, 0, 0, 0, 0, 100], dtype=int64),
array([ 0, 0, 100], dtype=int64))

and I don’t know why. I have run the code from the previous sections and got the correct results.
My Python version is 3.7, and my MXNet version is 1.6.0b20200215.

Hi @yoyoyoohh,

I ran your code and got the expected results:

(array([21, 21, 15,  9, 15, 19], dtype=int64), array([48, 52,  0], dtype=int64))

Can you check which versions of mxnet and d2l you are using?

Hi @gold_piggy,
My mxnet version is 1.6.0b20191125 and my d2l version is 0.11.3.
I tried the random module from both mxnet and numpy and found that numpy gives the correct result while mxnet does not. Here is the code:

from mxnet import npx
from mxnet import np as mxnet_np
import numpy
import random
import mxnet
npx.set_np()
fair_probs = [1.0 / 6] * 6
random.seed(0)
print(numpy.random.multinomial(100, fair_probs))
print(mxnet_np.random.multinomial(100, fair_probs))

and the result:

[22 16 13 19 20 10]
[ 0 0 0 0 0 100]
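
One way to flag this kind of output automatically is a rough goodness-of-fit check. The sketch below is an illustration, not part of MXNet or d2l: the helper names chi_square_stat and looks_degenerate are hypothetical, the threshold is an assumed cutoff near the 0.1% critical value of a chi-square distribution with 5 degrees of freedom, and the check only makes sense when every probability is strictly positive.

```python
import numpy as np

def chi_square_stat(counts, probs):
    """Pearson chi-square statistic of observed counts against probs.

    Hypothetical helper; requires every entry of probs to be > 0.
    """
    counts = np.asarray(counts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    expected = counts.sum() * probs
    return float(((counts - expected) ** 2 / expected).sum())

def looks_degenerate(counts, probs, threshold=20.5):
    """Return True if the counts are implausible under probs.

    The default threshold is an assumption suited to 6 categories.
    """
    return chi_square_stat(counts, probs) > threshold

fair_probs = [1.0 / 6] * 6
numpy_counts = [22, 16, 13, 19, 20, 10]   # output reported for numpy
mxnet_counts = [0, 0, 0, 0, 0, 100]       # output reported for mxnet

print(looks_degenerate(numpy_counts, fair_probs))
print(looks_degenerate(mxnet_counts, fair_probs))
```

The numpy counts give a small statistic, while putting all 100 draws in the last bin gives a statistic far above the cutoff, which points at the sampler rather than at bad luck.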

Hi @yoyoyoohh, please try upgrading to the nightly build of MXNet:

pip uninstall mxnet -y

pip install --pre mxnet

Let me know whether that helps!

Hi @gold_piggy, I upgraded mxnet, but when I import mxnet in a Python file, I get this error:

RuntimeError: Cannot find the MXNet library.
List of candidates:
G:\anaconda\lib\site-packages\mxnet\libmxnet.dll
G:\anaconda\lib\site-packages\mxnet../…/lib/libmxnet.dll
G:\anaconda\lib\site-packages\mxnet../…/build/libmxnet.dll
G:\anaconda\lib\site-packages\mxnet../…/build\libmxnet.dll
G:\anaconda\lib\site-packages\mxnet../…/build\Release\libmxnet.dll
G:\anaconda\lib\site-packages\mxnet../…/windows/x64\Release\libmxnet.dll

Google says I need to build the MXNet library myself, but I thought that was unnecessary for a library installed from pip. I don’t know what to do next.

@mli

Hi, I have some suggestions for improvement to the “AIDS example”.

First, there is no test for AIDS! There is a test for HIV.

Second, the way the notation is used for D and H is confusing. For example, you start by using D_1, but before a second test has been seen or indicated, the 1 is ambiguous. Re-read your introduction of this variable:

We use D_1 to indicate the diagnosis (1 if positive and 0 if negative)

It reads like you meant, D_1 is for positive, and so readers are primed to expect D_0 for negative! There are too many 1s and 0s flying around.

Third, it’s hard to read because you use H to indicate “HIV status” (and by the way, that is your first and only mention of HIV in the example). I confused it with “healthy”, since you used that word in the sentence before and it made more intuitive sense to me. You repeatedly say “AIDS” and “healthy”, when what you meant was “HIV” and “tested positive”. Overall, this just leads to confusion in what should be a straightforward example. I think you should just clean up your English and notation a little bit here.

Lastly, when you go to solve the problem for D_1, you should first briefly itemize your list of knowns and unknowns so it’s clear to readers why you’re computing what you’re computing. You threw an extra number in at the last second,

Assume that the population is quite healthy, e.g., P(H=1)=0.0015

after you already asked the question, and the order should be reversed so readers can understand the path forward to solving the problem. I was scratching my head when I read

Let us work out the probability of the patient having AIDS if the test comes back positive

since there wasn’t enough information at this point to actually solve.
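
To illustrate the "knowns first, then the question" ordering suggested above, here is a sketch of the single-test calculation. P(H=1) = 0.0015 is quoted in this post; the two conditional probabilities are assumed from the table in the book's section and should be checked against it.

```python
# Knowns. Only p_h1 is quoted in this thread; the two conditionals
# are assumptions taken from the book's table for the first test.
p_h1 = 0.0015            # P(H=1): prior probability of being HIV positive
p_d1_given_h1 = 1.0      # P(D1=1 | H=1), assumed
p_d1_given_h0 = 0.01     # P(D1=1 | H=0), assumed false-positive rate

# Derived known: the complement of the prior.
p_h0 = 1 - p_h1

# Unknown: P(H=1 | D1=1), obtained with Bayes' theorem.
# Step 1: marginal probability of a positive test.
p_d1 = p_d1_given_h1 * p_h1 + p_d1_given_h0 * p_h0
# Step 2: posterior.
p_h1_given_d1 = p_d1_given_h1 * p_h1 / p_d1

print(p_d1, p_h1_given_d1)
```

Listing the prior alongside the conditionals before asking for the posterior makes it clear that all three numbers are needed, which is the point of the reordering suggested above.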

Hi admins,

When I run the code in the notebooks, which I downloaded a week ago by following the Installation section of the book, I get the following error at cell 9 of probability.ipynb.

module ‘d2l’ has no attribute ‘set_figsize’

Has the d2l package changed after the book was published? If so, could you tell me which package version will be able to run the above code without errors?

This is good. Nice work with the probability material! Looking forward to more…

Hey… did you find a solution yet? Please let me know if you did.

Hi @pratikjain227: I installed a d2l package version that was released around the time the book came out, specifically 0.11.2. That solved it.

pip install d2l==0.11.2
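
To confirm which version is actually installed after pinning, something like the following may help. It is a sketch that relies on the standard-library importlib.metadata module, which requires Python 3.8 or newer; the helper name installed_version is made up for illustration.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of a package, or None.

    Hypothetical helper built on importlib.metadata (Python 3.8+).
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Prints the installed d2l version, or None if d2l is not installed.
print(installed_version("d2l"))
```

If this prints a version newer than the one the notebooks were written for, a missing attribute such as set_figsize is a plausible symptom of the mismatch.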

If you use Windows, see the issue https://github.com/apache/incubator-mxnet/issues/15383#issuecomment-637583446.

Hi @entangledloops, great suggestions! Thanks

Hi @all, sorry for the late reply. We have moved to a new discussion portal, https://discuss.d2l.ai/, to be framework agnostic! Please feel free to post your questions there.