Closed-form solution for finding a root

Suppose I have a pandas Series s whose values are all greater than or equal to 0 and sum to 1. I need to subtract a constant k from every value so that the new Series sums to 0.6. The catch is that values must never drop below zero: anything that would go negative after the subtraction is clipped to 0.

As a formula: given a series of values x_i, I want to find k such that

$$\sum_i \max(x_i - k,\ 0) = 0.6$$

MCVE

import pandas as pd
import numpy as np
from string import ascii_uppercase

np.random.seed([3, 141592653])
s = np.power(
    1000, pd.Series(
        np.random.rand(10),
        list(ascii_uppercase[:10])
    )
).pipe(lambda s: s / s.sum())

s

A    0.001352
B    0.163135
C    0.088365
D    0.010904
E    0.007615
F    0.407947
G    0.005856
H    0.198381
I    0.027455
J    0.088989
dtype: float64

The sum is 1

s.sum()

0.99999999999999989

What I’ve tried

I can use Newton’s method (among others) found in Scipy’s optimize module

from scipy.optimize import newton

def f(k):
    return s.sub(k).clip(0).sum() - .6

Finding the root of this function will give me the k I need

initial_guess = .1
k = newton(f, x0=initial_guess)

Then subtract this from s

new_s = s.sub(k).clip(0)
new_s

A    0.000000
B    0.093772
C    0.019002
D    0.000000
E    0.000000
F    0.338583
G    0.000000
H    0.129017
I    0.000000
J    0.019626
dtype: float64

And the new sum is

new_s.sum()

0.60000000000000009

Question

Can we find k without resorting to using a solver?

Solution:

Updated: Three different implementations – interestingly, the least sophisticated scales best.

import numpy as np

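def f_sort(A, target=0.6):
    N = len(A)  # number of values, taken from A rather than a global
    B = np.sort(A)
    # C[i] is the total removed when k == B[i], i.e. sum(min(A, B[i]))
    C = np.cumsum(np.r_[B[0], np.diff(B)] * np.arange(N, 0, -1))
    idx = np.searchsorted(C, 1 - target)
    # step back from B[idx] along the linear segment with slope N - idx
    return B[idx] + (1 - target - C[idx]) / (N-idx)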

def f_partition(A, target=0.6):
    target, l = 1 - target, len(A)
    while len(A) > 1:
        m = len(A) // 2
        A = np.partition(A, m-1)
        ls = A[:m].sum()
        if ls + A[m-1] * (l-m) > target:
            A = A[:m]
        else:
            l -= m
            target -= ls
            A = A[m:]
    return target / l            

def f_direct(A, target=0.6):
    target = 1 - target
    while True:
        gt = A > target / len(A)
        if np.all(gt):
            return target / len(A)
        target -= A[~gt].sum()
        A = A[gt]

N = 10
A = np.random.random(N)
A /= A.sum()

print(f_sort(A), np.clip(A-f_sort(A), 0, None).sum())
print(f_partition(A), np.clip(A-f_partition(A), 0, None).sum())
print(f_direct(A), np.clip(A-f_direct(A), 0, None).sum())

from timeit import timeit
kwds = dict(globals=globals(), number=1000)

N = 100000
A = np.random.random(N)
A /= A.sum()

print(timeit('f_sort(A)', **kwds))
print(timeit('f_partition(A)', **kwds))
print(timeit('f_direct(A)', **kwds))

Sample run:

0.04813686999999732 0.5999999999999999
0.048136869999997306 0.6000000000000001
0.048136869999997306 0.6000000000000001
8.38109541599988
2.1064437470049597
1.2743922089866828
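
To see where a closed form comes from, here is a minimal sketch in plain Python using the rounded values printed in the question. Once we know which values stay positive, k is simply (sum of those values - target) / (number of those values). The helper name closed_form_k is hypothetical, and this is not one of the three implementations above, only an illustration of the same sorting idea.

```python
# Rounded values copied from the Series printed in the question (A..J).
values = [0.001352, 0.163135, 0.088365, 0.010904, 0.007615,
          0.407947, 0.005856, 0.198381, 0.027455, 0.088989]
target = 0.6


def closed_form_k(values, target):
    """Hypothetical helper: grow the set of kept values from the largest down."""
    ordered = sorted(values, reverse=True)
    running = 0.0
    k = 0.0
    for m, x in enumerate(ordered, start=1):
        running += x
        # k that would hit the target if exactly the m largest values survive
        candidate = (running - target) / m
        if x <= candidate:
            # the m-th largest value would be clipped to 0, so stop growing the set
            break
        k = candidate
    return k


k = closed_form_k(values, target)
clipped_sum = sum(max(v - k, 0.0) for v in values)
print(k, clipped_sum)  # k is roughly 0.0694 for these rounded values
```

With these values the five largest entries (B, C, F, H, J) survive, which matches the non-zero entries of new_s in the question.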

Are hardcoded conditions optimized by JVM in Android?

For debugging reasons, most parts of my application's code contain this recurring snippet:

public static final boolean DEBUG = true; // just created once in a "Utility" class

  if (Utility.DEBUG)
     Log.d("TIMER", /*string message that is strictly related to context*/);

Now, if the boolean value is changed to false, this becomes dead code. My question is whether, in this case, the Android compiler performs basic optimizations such as constant folding and dead code elimination.

If the answer is no, what is the best way to strip out the debugging logs for release builds?

Solution:

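Yes. A static final boolean initialized with a constant expression is a compile-time constant, so the compiler can and will remove the unreachable section: when DEBUG is false, the body of the if statement is not emitted at all. You can verify this yourself by examining the bytecode (for example with javap -c).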

NumPy ufuncs are 2x faster in one axis over the other

I was doing some computation, and measured the performance of ufuncs like np.cumsum over different axes, to make the code more performant.

In [51]: arr = np.arange(int(1E6)).reshape(int(1E3), -1)

In [52]: %timeit arr.cumsum(axis=1)
2.27 ms ± 10.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [53]: %timeit arr.cumsum(axis=0)
4.16 ms ± 10.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

cumsum over axis 1 is almost 2x faster than cumsum over axis 0. Why is it so and what is going on behind the scenes? It’d be nice to have a clear understanding of the reason behind it. Thanks!

Solution:

You have a square array. It looks like this:

1 2 3
4 5 6
7 8 9

But computer memory is linearly addressed, so to the computer it looks like this:

1 2 3 4 5 6 7 8 9

Or, if you think about it, it might look like this:

1 4 7 2 5 8 3 6 9

If you are trying to sum [1 2 3] or [4 5 6] (one row), the first layout is faster. If you are trying to sum [1 4 7] or [2 5 8], the second layout is faster.

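This happens because loading data from memory happens one “cache line” at a time, which is typically 64 bytes (8 values with an 8-byte dtype such as NumPy’s default float64, or the 64-bit integers that np.arange typically produces). NumPy uses the first (row-major) layout by default, so a cumsum along axis 1 walks through adjacent memory, while a cumsum along axis 0 jumps a whole row ahead at every step.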

You can control which layout NumPy uses when you construct an array, using the order parameter.
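
As a small sketch of what the two layouts look like from Python (the array shape and variable names here are chosen only for illustration), the strides attribute shows how many bytes NumPy steps to move one position along each axis:

```python
import numpy as np

# Row-major ("C") layout is the default: elements of one row are adjacent.
a_c = np.arange(12, dtype=np.int64).reshape(3, 4)

# Same values stored column-major ("F"): elements of one column are adjacent.
a_f = np.array(a_c, order='F')

print(a_c.flags['C_CONTIGUOUS'], a_c.strides)  # True (32, 8)
print(a_f.flags['F_CONTIGUOUS'], a_f.strides)  # True (8, 24)

# The two arrays compare equal; only the memory layout differs.
print(np.array_equal(a_c, a_f))  # True
```

In the C-ordered array, moving along axis 1 advances 8 bytes (one element), while moving along axis 0 advances 32 bytes (one full row). In the F-ordered array the roles are swapped, which is why the faster axis depends on the layout.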

For more on this, see: https://en.wikipedia.org/wiki/Row-_and_column-major_order

pyDash – A Web Based Linux Performance Monitoring Tool

pydash is a lightweight web-based monitoring tool for Linux written in Python and Django plus Chart.js. It has been tested and can run on the following mainstream Linux distributions: CentOS, Fedora, Ubuntu, Debian, Arch Linux, Raspbian as well as Pidora.

You can use it to keep an eye on your Linux PC/server resources such as CPUs, RAM, network stats, processes including online users and more. The dashboard is developed entirely using Python libraries provided in the main Python distribution, so it has few dependencies; you don’t need to install many packages or libraries to run it.

In this article, we will show you how to install pydash to monitor Linux server performance.

How to Install pyDash on a Linux System

1. First install required packages: git and Python pip as follows:

-------------- On Debian/Ubuntu -------------- 
$ sudo apt-get install git python-pip
-------------- On CentOS/RHEL -------------- 
# yum install epel-release
# yum install git python-pip
-------------- On Fedora 22+ --------------
# dnf install git python-pip

2. If you have git and Python pip installed, next, install virtualenv which helps to deal with dependency issues for Python projects, as below:

# pip install virtualenv
OR
$ sudo pip install virtualenv

3. Now, using the git command, clone the pydash repository into your home directory like so:

# git clone https://github.com/k3oni/pydash.git
# cd pydash

4. Next, create a virtual environment for your project called pydashtest using the virtualenv command below.

$ virtualenv pydashtest #give a name for your virtual environment like pydashtest

Create Virtual Environment

Important: Take note of the virtual environment’s bin directory path highlighted in the screenshot above; yours could be different depending on where you cloned the pydash folder.

5. Once you have created the virtual environment (pydashtest), you must activate it before using it as follows.

$ source /home/aaronkilik/pydash/pydashtest/bin/activate

Active Virtual Environment

From the screenshot above, you’ll note that the PS1 prompt changes indicating that your virtual environment has been activated and is ready for use.
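
If you would rather not rely on the prompt alone, a minimal sketch like the following can confirm from inside Python that the interpreter belongs to a virtual environment. The function name in_virtualenv is hypothetical; the check assumes that older virtualenv releases set sys.real_prefix, while venv and newer virtualenv releases make sys.base_prefix differ from sys.prefix.

```python
import sys


def in_virtualenv():
    """Hypothetical helper: True when this interpreter runs inside a virtual environment."""
    # Older virtualenv releases record the original interpreter in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases keep it in sys.base_prefix instead.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print(in_virtualenv())
```

Run it with the python command right after activating the environment; it should print True there and False in a shell where no environment is active.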

6. Now install the pydash project requirements; if you are curious, view the contents of requirements.txt using the cat command, and then install them as shown below.

$ cat requirements.txt
$ pip install -r requirements.txt

7. Now open settings.py, either by moving into the pydash directory that contains it or by simply running the command below, and change the SECRET_KEY to a custom value.

$ vi pydash/settings.py

Set Secret Key

Save the file and exit.
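
If you need a value to paste in as the SECRET_KEY, one option is to generate a random string with Python's standard library. This is only a sketch: it assumes a Python 3.6 or newer interpreter is available somewhere (it does not have to be the one inside the virtual environment), and the length of 50 bytes is an arbitrary choice.

```python
import secrets

# Generate a URL-safe random string from 50 random bytes.
key = secrets.token_urlsafe(50)
print(key)
```

Copy the printed string into settings.py as the new SECRET_KEY value and keep it private.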

8. Afterward, run the Django command below to create the project database, install Django’s auth system and create a project super user.

$ python manage.py syncdb

Answer the questions below according to your scenario:

Would you like to create one now? (yes/no): yes
Username (leave blank to use 'root'): admin
Email address: aaronkilik@gmail.com
Password: ###########
Password (again): ############

Create Project Database

9. At this point, all should be set, now run the following command to start the Django development server.

$ python manage.py runserver

10. Next, open your web browser and type the URL: http://127.0.0.1:8000/ to get the web dashboard login interface. Enter the super user name and password you created while creating the database and installing Django’s auth system in step 8 and click Sign In.

pyDash Login Interface

11. Once you log in to the pydash main interface, you will get a section for monitoring general system info, CPU, memory and disk usage together with system load average.

Simply scroll down to view more sections.

pyDash Server Performance Overview

12. Next is a screenshot of pydash showing a section for keeping track of interfaces, IP addresses, Internet traffic, disk reads/writes, online users and netstats.

pyDash Network Overview

13. Next is a screenshot of the pydash main interface showing a section to keep an eye on active processes on the system.

pyDash Active Linux Processes