Using tkinter and python to continuously display the output from a system command

I posted an answer to a Stack Overflow question. The poster wanted to display the output from a ‘netstat’ command every second. I suggested using a tkinter screen. To run the netstat command every second, the command line entry is ‘netstat 1’. This is fed to a subprocess, and the subprocess is wrapped in a thread to avoid blocking the main thread, which needs to be left free to deal with the tkinter display; GUIs like to hog the main thread. Don’t forget to use the ‘undo=False’ option with the tk.Text widget. Otherwise all of the display is continuously saved to an undo buffer, and the python process gobbles up memory as the output from netstat is added to it each second.

import threading
from subprocess import Popen, PIPE
import tkinter as tk
from tkinter import END, INSERT

 

PROCESS = ['netstat','1']
class Console(tk.Frame):
    def __init__(self, master, *args, **kwargs):
        tk.Frame.__init__(self, master, *args, **kwargs)
        self.text = tk.Text(self, undo=False)
        self.text.pack(expand=True, fill="both")
        # run process in a thread to avoid blocking gui
        t = threading.Thread(target=self.execute)
        t.start()
 
 
    def display_text(self, p):
        # buffer one screenful of netstat output; each time a new 'Active'
        # header arrives, replace the widget contents with the buffered text
        display = ''
        # with universal_newlines=True the pipe yields str, so the
        # end-of-stream sentinel for iter() is '' rather than b''
        lines_iterator = iter(p.stdout.readline, '')
        for line in lines_iterator:
            if 'Active' in line:
                self.text.delete('1.0', END)
                self.text.insert(INSERT, display)
                display = ''
            display = display + line


    def display_text2(self, p):
        # alternative approach: update the display line by line, clearing
        # the widget whenever a new 'Active' header arrives
        while p.poll() is None:
            line = p.stdout.readline()
            if line != '':
                if 'Active' in line:
                    self.text.delete('1.0', END)
                self.text.insert(END, line)


    def execute(self):
        p = Popen(PROCESS, universal_newlines=True,
                  stdout=PIPE, stderr=PIPE)
        print('process created with pid: {}'.format(p.pid))
        self.display_text(p)

  
if __name__ == "__main__":
    root = tk.Tk()
    root.title("netstat 1")
    Console(root).pack(expand=True, fill="both")
    root.mainloop()
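One loose end with the code above: closing the window leaves the netstat subprocess and the reader thread running. Below is a minimal sketch of one way to tidy this up, using a daemon thread and a handler for the window close event. This is my own variation, not part of the original answer.

import threading
from subprocess import Popen, PIPE
import tkinter as tk

PROCESS = ['netstat', '1']

def reader(p, widget):
    # append each line of netstat output to the text widget
    # (like the original answer, this pokes the widget from a worker thread)
    for line in iter(p.stdout.readline, ''):
        widget.insert(tk.END, line)

root = tk.Tk()
root.title("netstat 1")
text = tk.Text(root, undo=False)
text.pack(expand=True, fill="both")

# start netstat in the main thread; only the blocking reads go in the worker
p = Popen(PROCESS, universal_newlines=True, stdout=PIPE, stderr=PIPE)
# daemon=True means the reader thread will not keep python alive on exit
threading.Thread(target=reader, args=(p, text), daemon=True).start()
# terminate netstat when the window is closed
root.protocol("WM_DELETE_WINDOW", lambda: (p.terminate(), root.destroy()))
root.mainloop()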

Sublime Text 3, adding a custom python 3 build

Typing ‘python’ at the command line of my Linux Mint 18 install gives me a python 2.7 prompt. So when I ran a python script in Sublime Text, it was run using Python 2.7. But I want to use python 3! So I added a custom python 3 build.

I use Linux Mint 18. The “shell_cmd” mentioned below will be different for Windows and maybe for Mac OS as well.

To create a build option in Sublime Text 3 for your favorite version of Python, create a file called:

sublime_install/Data/Packages/User/Python3.sublime-build

Where sublime_install is the path to the directory where you have sublime installed.

The file should contain this text:

{
    "shell_cmd": "/usr/bin/env python3 -u ${file}",
    "selector": "source.python",
    "file_regex": "^(...*?):([0-9]*):?([0-9]*)",
    "working_dir": "${file_path}",
}

You may need to change ‘python3’ to whichever command starts the version of python you want to run.
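For example, on Windows, assuming the python 3 you want is on your PATH as ‘python’ (an assumption – I have not tested this build myself), something along these lines should work:

{
    "shell_cmd": "python -u ${file}",
    "selector": "source.python",
    "file_regex": "^(...*?):([0-9]*):?([0-9]*)",
    "working_dir": "${file_path}",
}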

The option ‘Python3’ will now appear in your build menu on Sublime Text 3.

The -u option in the “shell_cmd” removes output buffering. I missed this out initially, leading to some head scratching. My scripts would run, but I wouldn’t see any output for some time – until the output buffer had filled. Luckily Stackoverflow came to my aid: https://stackoverflow.com/questions/50296736/how-to-remove-output-buffering-when-running-python-in-sublime-text-3

Python 3, threading and references

Creating a thread

I used threading to enable real-time graphing of data from sensors. One thread collected data from the sensors. The main thread ran the real time graph. I had a few problems getting started. It came down to my incorrect use of brackets when creating the thread.

When we create a thread using the threading library, we need to pass the target to the thread without using brackets. e.g.

thread = threading.Thread(target=ThreadTest)

not

thread = threading.Thread(target=ThreadTest())

Otherwise the target is created in the main thread, which is what we are trying to avoid. Without the brackets, we pass a reference to the target. With the brackets, we have already created the object. I think that this is analogous to passing a pointer in C, but stand to be corrected.
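If the target needs arguments, they are passed separately through args or kwargs so the call itself is still deferred to the new thread. A quick sketch of my own to illustrate:

import threading

def worker(name, delay=1):
    print('worker {} started with delay {}'.format(name, delay))

# pass the function reference plus its arguments;
# the call happens in the new thread, not here
thread = threading.Thread(target=worker, args=('sensor',), kwargs={'delay': 2})
thread.start()
thread.join()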

Example

In test1.py I pass ThreadTest as the target without using brackets. ThreadTest starts in the new thread and test1.py continues running.

In test2.py, I pass ThreadTest() as the target. In this case ThreadTest() is called in the main thread before the new thread is even created, so its loop blocks and test2.py never continues.

test1.py

import threading
from thread_test import ThreadTest

thread = threading.Thread(target=ThreadTest)
thread.start()
print('not blocked')

test2.py

import threading
from thread_test import ThreadTest

thread = threading.Thread(target=ThreadTest())
thread.start()
print('not blocked')

thread_test.py

from time import sleep


class ThreadTest():
    def __init__(self):
        print('thread_test started')
        while True:
            sleep(1)
            print('test_thread')

output from test1.py:

thread_test started
not blocked
test_thread
test_thread
test_thread

output from test2.py:

thread_test started
test_thread
test_thread
test_thread

Relative imports in Jupyter notebooks

How do we import a module from a .py or a .ipynb file into a Jupyter notebook from a different directory?
I wrote this post after answering a question on stackoverflow:
https://stackoverflow.com/questions/49282122/import-modules-in-jupyter-notebook-path/

For example, if we have the directory structure:

analysis.ipynb
/src/configuration.py
/src/configuration_nb.ipynb

How do we access the file configuration.py or the notebook configuration_nb.ipynb?

The nbimporter module helps us here:

pip install nbimporter

/src/configuration.py

class Configuration():
    def __init__(self):
        print('hello from configuration.py')

analysis.ipynb:

import nbimporter
from src import configuration

new = configuration.Configuration()

output:

hello from configuration.py

We can also import and use modules from other notebooks. If you have configuration_nb.ipynb in the /src directory:

src/configuration_nb.ipynb:

class Configuration_nb():
    def __init__(self):
        print('hello from configuration notebook')

analysis.ipynb:

import nbimporter
from src import configuration_nb

new = configuration_nb.Configuration_nb()

output:

Importing Jupyter notebook from ......\src\configuration_nb.ipynb
hello from configuration notebook
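If you only need the plain .py module and would rather not install anything, a common alternative (not what I used above) is to add the directory to sys.path before importing:

import sys
import os

# add the src directory (relative to the notebook) to the import path
sys.path.insert(0, os.path.abspath('src'))

import configuration
new = configuration.Configuration()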

Running pytest when the test files are in a different directory to the source files

I had a battle to get my testing directory structure to work outside of an IDE. Please find my solution below. Tested on Windows 7 using python 3.6 and Linux Mint using python 3.4, running the code using the command line:

python -m pytest test_compress_files.py

The file I wrote to be tested is called compress_files.py in a directory named \src. The file containing tests to be run using pytest is called test_compress_files.py in a subdirectory \tests, so the full directory path is \src\tests. I needed to add a file called context.py to the \src\tests directory. This file is used in test_compress_files.py to enable access to compress_files.py in the directory above. The __init__.py files are empty.

Directory structure:

\src
__init__.py
compress_files.py

\src\tests
__init__.py
context.py
test_compress_files.py  

compress_files.py contains the script to be tested.

context.py:

import os
import sys
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))

import compress_files  

The line:

sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))

comes from the suggestion in The Hitchhiker’s Guide to Python at:

http://docs.python-guide.org/en/latest/writing/structure/.

This adds the path of the directory above the /src/tests directory to sys.path, which in this case is /src.

test_compress_files.py:

import os
import pytest
from .context import compress_files
from compress_files import *

# tests start here
...

I put this up as an answer to a stackoverflow question here.

Sending parameters to a Jupyter Notebook cell using click

Using libraries such as click and optparse we can send parameters to Python scripts when we run them from the command line. For example, passing a parameter called count with a value of 2 to a script called hello.py:

hello.py --count=2

How can I replicate this functionality in a cell of a Jupyter notebook? I like to run the same code in the notebook so that I can easily copy it to a stand-alone script. Using sys.argv to pass parameters to the main function seemed one way to go, and it works with optparse:

from optparse import OptionParser
import sys


def main():
    parser = OptionParser()
    parser.add_option('-f', '--fake',
                      default='False',
                help='Fake data')
    (options,args) = parser.parse_args()
    print('options:{} args: {}'.format(options, args))
    if options.fake:
        print('Fake detected')
    
def test_args():
    
    print('hello')
    
if __name__ == '__main__':

    sys.argv = ['--fake', 'True' '--help']
    main()

output:

options:{'fake': 'False'} args: ['True--help']

Fake detected

Click seems to be flavor of the month, but I kept on getting a screen full of errors when I tried to run click through a Jupyter notebook cell. If we consider the Click example code:

import click

@click.command()
@click.option('--count', default=1, help='Number of greetings.')
@click.option('--name', prompt='Your name',
            help='The person to greet.')
def hello(count, name):
    """Simple program that greets NAME for a total of COUNT times."""
    for x in range(count):
        click.echo('Hello %s!' % name)

if __name__ == '__main__':
    hello()

If this file is called hello.py, then running from a command line:

hello.py --name 'Max' --count=3

Gives this output:

Hello Max!

Hello Max!

Hello Max!

But using the same sys.argv trick that works with optparse produces a screen full of errors when the same code is run from a Jupyter notebook cell. The solution is to put the %%python magic at the start of the cell, which runs the cell contents as a separate python process:

%%python

import sys
import click

@click.command()
@click.option('--count', default=1, help='Number of greetings.')
@click.option('--name', prompt='Your name',
            help='The person to greet.')
def hello(count, name):
    """Simple program that greets NAME for a total of COUNT times."""
    with open('echo.txt', 'w') as fobj:
        for x in range(count):
            click.echo('Hello %s!' % name)

if __name__ == '__main__':
    # first element is the script name, use empty string instead
    sys.argv = ['', '--name', 'Max', '--count', '3']
    hello()

A small tip, but one which cost me an hour or two of pondering. Finally I asked the hive mind of Stackoverflow. Please see this stackoverflow solution.
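Another option, which avoids spawning a separate process altogether, is click’s own test runner, CliRunner. I did not use this in the post, but it is part of click’s public API and runs the command in-process, so it works in a normal notebook cell:

import click
from click.testing import CliRunner

@click.command()
@click.option('--count', default=1, help='Number of greetings.')
@click.option('--name', prompt='Your name', help='The person to greet.')
def hello(count, name):
    """Simple program that greets NAME for a total of COUNT times."""
    for x in range(count):
        click.echo('Hello %s!' % name)

# invoke the command in-process with the arguments we would have
# typed on the command line
runner = CliRunner()
result = runner.invoke(hello, ['--name', 'Max', '--count', '3'])
print(result.output)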

how to configure the accelerometer range on the microbit using micropython

This article details how to set the range of sensitivity on the accelerometer on the microbit board using micropython and the i2c interface. I am using v1.7.9 of micropython for the microbit, the mu editor and linux mint v17.
 

After listening to Joe Finney talk about his role in developing the microbit board I realised I could use it for some of my hand gesture assistive technology work. The accelerometer on the microbit board is an MMA8653FC, data sheet here. There are programming notes for this chip here. The default range for this chip is +/-2g. This can be reconfigured to be +/-4g or +/-8g. For some of the students I work with on gesture recognition I need the higher ranges. So I entered the world of microbit i2c programming.

I chose the micropython platform as python is always the ‘second best choice’ for any programming application. Actually, I’m a fan of using C for embedded hardware, but in this case micropython looked to be the fastest way of getting a solution. I used the simple mu editor.

Long story short, it’s all about syntax. Thanks go to fizban for his example microbit code to interface a microbit with an lcd display using i2c. After reading this code I fixed the mistake(s) I’d been making. The documentation for microbit micropython i2c is here.

Here’s my working code:

''' microbit i2c communications with onboard accelerometer '''
from microbit import *

ACCELEROMETER = 0x1d
ACC_2G = [0x0e, 0x00]
ACC_4G = [0x0e, 0x01]
ACC_8G = [0x0e, 0x02]
CTRL_REG1_STANDBY = [0x2a, 0x00]
CTRL_REG_1_ACTIVE = [0x2a, 0x01]
PL_THS_REG = [0x14] # returns b'\x84'
PL_BF_ZCOMP = [0x13] # returns b'\x44' = 'D'
WHO_AM_I = [0x0d] # returns 0x5a=b'Z'
XYZ_DATA_CFG = [0x0e]

def command(c):
    ''' send command to accelerometer '''
    i2c.write(ACCELEROMETER, bytearray(c))

def i2c_read_acc(register):
    ''' read accelerometer register '''
    i2c.write(ACCELEROMETER, bytearray(register), repeat=True)
    read_byte = i2c.read(ACCELEROMETER, 1)
    print('read: {}'.format(read_byte))

def main_text():
    ''' send accelerometer data as a string '''
    print('starting main')
    counter = 0
    while True:
        x = accelerometer.get_x()
        y = accelerometer.get_y()
        z = accelerometer.get_z()
        counter = counter + 1
        print('{} {} {} {}'.format(counter, x, y, z))
        sleep(250)

print("sending i2c commands...")
print('reading PL_BF_ZCOMP :')
print(i2c_read_acc(PL_BF_ZCOMP))
print('reading WHO_AM_I')
print(i2c_read_acc(WHO_AM_I))
# check the initial accelerometer range
print('reading XYZ_DATA_CFG:')
print(i2c_read_acc(XYZ_DATA_CFG))
# change the accelerometer range
command(CTRL_REG1_STANDBY)
command(ACC_4G)
command(CTRL_REG_1_ACTIVE)
print('commands sent')
# check the accelerometer range
print('reading XYZ_DATA_CFG:')
print(i2c_read_acc(XYZ_DATA_CFG))
display.show(Image.MEH)
# main_text()

output:

reading PL_BF_ZCOMP :
read: b'D'
None
reading WHO_AM_I
read: b'Z'
None
reading XYZ_DATA_CFG:
read: b'\x00'
None
commands sent
reading XYZ_DATA_CFG:
read: b'\x01'
None

The onboard accelerometer has an i2c address of 0x1d. There is a good article on how to scan for and verify this address here. I set the variable ACCELEROMETER to this value at the top of the code so that I could refer to it throughout without having to remember the hex value. Too many hex values flying around – I’d be bound to make a mistake if I didn’t give them names.

To send a command over i2c, as the command() function shows, you address the target then send the command as a bytearray. In this case the target is the accelerometer. Typically we send two bytes: the first specifies the register we want to change, the second the value we want written to that register. For example, to set the accelerometer’s range of sensitivity, we need to set the register called XYZ_DATA_CFG to the value that corresponds with the range we are after. The address of this register is 0x0e. To select the +/-4g range, we set this register to 0x01. Now the ACC_4G constant defined at the top of the code should make sense. Look in the data sheet linked above for more details. Before we can change this register we have to put the chip into standby by writing 0x00 to CTRL_REG1. After changing the XYZ_DATA_CFG register we have to set CTRL_REG1 active again by writing 0x01 to it. This is detailed in the accelerometer application notes which I linked at the start of this article.

If you uncomment the last line, the raw accelerometer values will stream out. The last column contains the values for the z-axis of the accelerometer. Lay the board flat on the table. With the default +/-2g range you will see z-axis values around +1024 or -1024 depending on whether the board is face up or down. This corresponds to +/-1g on the +/-2g range. Now that the board is set to +/-4g, the values for +/-1g will be +/-512. The maximum and minimum value for the accelerometer stays at +/-2048, but it is now spread over +/-4g. Similarly, if you go crazy and set the range to +/-8g, you will see +/-256 on the z-axis with the board lying flat. As you would expect, you have to wave the board harder to max it out at the higher ranges compared with the default +/-2g range.
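As a rough illustration of that scaling (my own sketch based on the counts quoted above, not anything taken from the data sheet), the raw counts can be converted to g like this:

def raw_to_g(raw, range_g=4):
    ''' convert a raw accelerometer count to g, assuming the signed
        range of +/-2048 counts is spread over +/-range_g '''
    return raw * range_g / 2048.0

print(raw_to_g(512, 4))   # 1.0: board lying flat on the +/-4g range
print(raw_to_g(1024, 2))  # 1.0: board lying flat on the default +/-2g range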

So what about the PL_BF_ZCOMP and WHO_AM_I registers that I read near the bottom of the listing? These are two read-only registers. Reading the values stored in them is a sanity check that the chip is turned on and that I have working code. I read XYZ_DATA_CFG before and after setting it to verify that the sensitivity range really has been changed. Read up on these registers in the data sheet.

Look at the i2c.write call in i2c_read_acc. The repeat=True flag has to be set. This clears the ‘message end’ flag in the write command. The default for this flag is False, which means the i2c write ends with a ‘message end’ flag, which terminates the operation. As we want to read from the chip with the i2c.read call that follows, we must not send the ‘message end’ flag. Otherwise you will just read 0xff. Can you guess why? The data line is held high for i2c, so if there is nothing coming out of the chip you are trying to read from, you just read a bunch of ‘1s’. The i2c.read call means ‘read 1 byte from the device with address ACCELEROMETER’.

Where I initially came unstuck was by sending the data as individual bytes, using e.g. b'\x0e' followed by b'\x02' to try and change the XYZ_DATA_CFG register. This looks to be valid for the Adafruit implementation of micropython, but I couldn’t get it to work here.

parsing and unpacking python3 serial data containing double backslashes

edit 11th October 2017: The ‘eval’ statements originally used in the code shown below have been replaced with the safer ‘literal_eval’ from the ast module in the standard library. From the python docs: ‘Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.’

This can be used for safely evaluating strings containing Python values from untrusted sources without the need to parse the values oneself. It is not capable of evaluating arbitrarily complex expressions, for example involving operators or indexing.

I lost a day of my life figuring out how to parse serial data sent as bytes from the BBC Microbit using micropython. The problem is that the data byte string appears with double backslash characters instead of single backslashes when read in over a serial interface.
Actual data:

b'ST\\x00\\x00\\x00\\xe0\\xeaE\\x00\\x00HB\\x00\\x00`\\xc3\\x00\\x00\\x10C\\x00\\x00t\\xc4EN'

What I wanted as data:

b'ST\x00\x00\x00\xe0\xeaE\x00\x00HB\x00\x00`\xc3\x00\x00\x10C\x00\x00t\xc4EN'

So how do we convert the malformed byte string into the clean one that python 3 can use?
I really went around in circles on this one. In the end I used a kludge. But it works. My life can now move on.
I convert the double-backslash bytes to a string. Then I use the replace method to replace ‘\\’ with ‘\’. Then I use the literal_eval function to recast the result as bytes. I am open to suggestions for a cleaner way of doing this!
Here’s some example code I used in a jupyter notebook session. test2 is the malformed byte string received over the serial interface and test3 is the cleaned bytes that I can now unpack and extract the data from.

from struct import *
from ast import literal_eval
PACKER = ('2s5f2s')
test2=b'ST\\x00\\x00\\x00\\xe0\\xeaE\\x00\\x00HB\\x00\\x00`\\xc3\\x00\\x00\\x10C\\x00\\x00t\\xc4EN'
test3 = str(test2)
test3 = test3.replace('\\\\', '\\')
print('{}'.format(test3))
test3 = literal_eval(test3)
print(test3)
print(unpack(PACKER,test3))

output:

b'ST\x00\x00\x00\xe0\xeaE\x00\x00HB\x00\x00`\xc3\x00\x00\x10C\x00\x00t\xc4EN'
b'ST\x00\x00\x00\xe0\xeaE\x00\x00HB\x00\x00`\xc3\x00\x00\x10C\x00\x00t\xc4EN'
(b'ST', 7516.0, 50.0, -224.0, 144.0, -976.0, b'EN')
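For completeness, here is a possibly tidier alternative (a sketch on my part, not something I used in the original post): decode the bytes with the unicode_escape codec, which interprets the literal \xNN sequences, then re-encode with latin-1 to get back to bytes.

from struct import unpack

PACKER = ('2s5f2s')
test2 = b'ST\\x00\\x00\\x00\\xe0\\xeaE\\x00\\x00HB\\x00\\x00`\\xc3\\x00\\x00\\x10C\\x00\\x00t\\xc4EN'
# interpret the literal \xNN escape sequences, then map each character back to a byte
clean = test2.decode('unicode_escape').encode('latin-1')
print(clean)
print(unpack(PACKER, clean))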

The data was produced from reading the accelerometer on a BBC Microbit board then using struct.pack(PACKER,scan). I am programming the boards using micropython.
The data is packed using the packer format:

PACKER = ('2s5f2s')

The transmitted scan is constructed using:

values = (START, counter, DELTA, x, y, z, END)
scan = struct.pack(packer, *values)

Here values contains a START and an END string (‘ST’ and ‘EN’ respectively), a sample counter, a constant called DELTA which represents the time between samples, and the x, y and z readings from the accelerometer. So PACKER means ‘2 characters followed by 5 floats followed by 2 characters’.
I was being obstinate in sending bytes over the serial interface instead of a string. Why use bytes and not just send a text string? Using pack and unpack enforces a structure on the data packets and reduces the amount of data to be transmitted compared with a string. Consider the reading ‘-224.0’ sent using the packer format. This is coded as an ‘f’, meaning a float, which packs to 4 bytes. Sending ‘-224.0’ as a string would require 6 bytes, one for each character.
If I pack the values behind the string:

'ST 7516.0 50.0 -224.0 144.0 -976.0 EN'

using the packer ‘2s5f2s’, the message is 26 bytes (24 bytes of packed data plus 2 bytes of alignment padding added by struct). If I send it as a plain string, it is 37 bytes. Please see the example code and its output below.

from struct import *
PACKER = ('2s5f2s')
test = 'ST 7516.0 50.0 -224.0 144.0 -976.0 EN'
test2 = (b'ST',7516.0,50.0,-224.0,144.0,-976.0,b'EN')
print('string length: {}'.format(len(test)))
packed_data = pack(PACKER,*test2)
print('packed length: {}'.format(len(packed_data)))
print('unpacked data: {}'.format(unpack(PACKER,packed_data)))

output:

string length: 37
packed length: 26
unpacked data: (b'ST', 7516.0, 50.0, -224.0, 144.0, -976.0, b'EN')

The second reason for using pack and unpack rather than just sending a plain string is that it enforces a degree of error checking. If the data is corrupted while reading from the sensor, an error will be raised during the pack process at the transmitting end. If the data packet is corrupted during transmission, an error will be raised during the unpack process at the receiving end. This can be caught using a try-except clause, as sketched below.
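A minimal sketch of catching a corrupted packet at the receiving end; struct raises struct.error when the buffer does not match the format:

from struct import unpack, error

PACKER = ('2s5f2s')

def decode_packet(packet):
    ''' return the unpacked tuple, or None if the packet is corrupted '''
    try:
        return unpack(PACKER, packet)
    except error:
        # wrong length or otherwise malformed packet
        return None

print(decode_packet(b'garbage'))   # None: too short for '2s5f2s'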

EWMA filter example using pandas and python

This article gives an example of how to use an exponentially weighted moving average filter to remove noise from a data set using the pandas library in python 3. I am writing this as the syntax for the library function has changed. The syntax I had been using is shown in Connor Johnson’s well explained example here.
I will give some example code, plot the data sets then explain the code. The pandas documentation for this function is here. Like a lot of pandas documentation it is thorough, but could do with some more worked examples. I hope this article will plug some of that gap.
Here’s the example code:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
ewma = pd.Series.ewm

x = np.linspace(0, 2 * np.pi, 100)
y = 2 * np.sin(x) + 0.1 * np.random.normal(x)
df = pd.Series(y)
# take EWMA in both directions then average them
fwd = ewma(df,span=10).mean() # take EWMA in fwd direction
bwd = ewma(df[::-1],span=10).mean() # take EWMA in bwd direction
filtered = np.vstack(( fwd, bwd[::-1] )) # lump fwd and bwd together
filtered = np.mean(filtered, axis=0 ) # average
plt.title('filtered and raw data')
plt.plot(y, color = 'orange')
plt.plot(filtered, color='green')
plt.plot(fwd, color='red')
plt.plot(bwd, color='blue')
plt.xlabel('samples')
plt.ylabel('amplitude')
plt.show()

This produces the following plot. Orange line = noisy data set. Blue line = backwards filtered EWMA data set. Red line = forwards filtered EWMA data set. Green line = sum and average of the two EWMA data sets. This is the final filtered output.

EWMA filtered and raw data.

Let’s look at the example code. After importing the libraries I need and creating the ewma shorthand in lines 1-4, I create some example data. Line 6 creates 100 x-values spaced evenly from 0 to 2 * pi. Line 7 creates 100 y-values from these x-values: each y value = 2*sin(x) + some noise. The noise is generated using the np.random.normal function. This noisy sine function is plotted in line 15 and can be seen as the jagged orange line on the plot.
Forwards and backwards EWMA filtered data sets are created in lines 10 and 11.
Line 10 starts with the first x-sample and the corresponding y-sample and works forwards and creates an EWMA filtered data set called fwd. This is plotted in line 17 as the red line.
Line 11 starts at the opposite end of the data set and works backwards to the first – this is the backwards EWMA filtered set, called bwd. This is plotted in line 18 as the blue line.
These two EWMA filtered data sets are added and averaged in lines 12-13. This data set is called filtered. This data set is plotted in line 16 as the green line.
If you look at the ewma calls in lines 10 and 11, there is a parameter called span. This controls the width of the filter. The forward and backward filtered data sets are each shifted relative to the raw data by an amount that grows with the span, in opposite directions; averaging the two largely cancels this shift. Increasing the span increases the smoothing and the lag. Increasing the value will also reduce the peaks of the filtered data relative to the unfiltered data. You need to try out different values.
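For example, a quick way to compare two span settings on the same sort of noisy series (a self-contained sketch, separate from the plot above):

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
ewma = pd.Series.ewm

x = np.linspace(0, 2 * np.pi, 100)
df = pd.Series(2 * np.sin(x) + 0.1 * np.random.normal(x))
light = ewma(df, span=5).mean()   # less smoothing, less lag
heavy = ewma(df, span=30).mean()  # more smoothing, more lag
plt.plot(df, color='orange')
plt.plot(light, color='green')
plt.plot(heavy, color='purple')
plt.show()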
My present application for this filter is removing jitter from accelerometer data. I have also used this filter to smooth signals from hydrophones.