Finished Freelancing engagements

These projects are already finished. They are listed here for historical purposes only. Check the current page.

Build an Amazon EC2 Image with Ubuntu

Description:
Create an Amazon EC2 Machine Image (AMI) based on Ubuntu 9.10. The following software should be installed and running upon boot:

  • apache2 (http/https) including a self-signed SSL cert
  • MySQL
  • http://piwik.org/ in the Web-subdirectory /piwik
  • http://www.openx.org/ in the Web-subdirectory /openx
  • http://munin.projects.linpro.no/ (node and master) output in the Web-subdirectory /munin

Remove ALL unused Apache modules, including mod_autoindex et al. Keep only the needed modules and mod_status. Add basic hardening techniques, e.g. not mentioning the Apache version in the “Server” header.

Also install:

  • Python 2.6, IPython
  • Amazon EC2 Tools

The MySQL database should be storable on an EBS device. Please provide scripts to mount it.

Platform:
Amazon EC2 (small instance), Ubuntu 9.10

Deliverables:
Access to an AMI image. When launched, the image should be accessible via the standard “provide ssh key during launch” method used by the original Amazon AMIs (and 90% of the other AMIs).

* All deliverables must be uploaded to Rent A Coder before the deadline(s) for this project…with no exceptions. If this contract makes it impossible for a competent person to do this, then do not start this project…but instead alert Rent A Coder of an un-arbitratable, illegal project. * Remember that contacting the other party outside of the site (by email, phone, etc.) on all business projects < $500 (before the buyer’s money is escrowed) is a violation of both the software buyer and seller agreements. Rent A Coder monitors all site activity for such violations and can instantly expel transgressors on the spot, so we thank you in advance for your cooperation. If you notice a violation please help out the site and report it. Thanks for your help.

Create an IMAP2HTML Mail Archiver

Description:
There are several very old tools to convert Mail to static HTML pages, e.g. http://www.hypermail-project.org/ http://www.mhonarc.org/

Build a Python-based tool to read mails from an IMAP server and generate one HTML page per message. Include overview pages by sender, day, month, year and subject.

Ensure that HTML mails are handled well and that attachments are saved and clickable from the HTML pages. Add a JavaScript-based “show all mail headers” function to the HTML pages.

Command line options should look like this:

remove-attachments
--server=myserver.example.com # IMAP hostname
--user=username # IMAP user
--password=pass # IMAP password
--outputdir=/var/www/mailarchive/
--mindate=2009-03-01 # only check messages sent before this date
--minsize=1500 # only check messages bigger than 1500 KB
--remove # remove / delete on IMAP server after processing
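
A minimal sketch of how the options listed above could be parsed. It is written for a current Python 3 with argparse rather than the Python 2.6 named in the brief (where optparse would be the standard-library choice); the helper names build_parser and parse_date, the defaults and the type conversions are assumptions, not part of the brief.

```python
import argparse
from datetime import datetime


def parse_date(text):
    """Turn a YYYY-MM-DD string into a date object."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def build_parser():
    """Build a parser for the options named in the brief."""
    parser = argparse.ArgumentParser(prog="remove-attachments")
    parser.add_argument("--server", required=True, help="IMAP hostname")
    parser.add_argument("--user", required=True, help="IMAP user")
    parser.add_argument("--password", required=True, help="IMAP password")
    parser.add_argument("--outputdir", default=".", help="target directory for the HTML pages")
    parser.add_argument("--mindate", type=parse_date, help="date limit in YYYY-MM-DD form")
    parser.add_argument("--minsize", type=int, default=0, help="size limit in KB")
    parser.add_argument("--remove", action="store_true", help="delete on the IMAP server after processing")
    return parser
```

Calling build_parser().parse_args() without arguments reads sys.argv; passing a list of strings makes the parser easy to exercise in unit tests.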

Platform:
Python 2.6, Unix/Linux

Deliverables:

  • Python 2.6 based application fulfilling the requirements above.
  • Including a requirements.txt file which can be used with the pip utility to install all dependencies.
  • Python Code must follow http://blogs.23.nu/disLEXia/2009/11/coding-guidelines/
  • Copyright of the code written by you for this project will be transferred to us

Categories:
(Note: Like everything else on this page, these categories are part of the original contract for this bid request.)
Python

Django based Framework for FTP and HTTP upload processing

For data regularly sent by our suppliers we need a combination of FTP server and web application. Users upload files and, when the upload is finished, the files will be automatically processed by our system. Users have different options for upload:

  • FTP into a directory “to_hudora”
  • HTTP Form based upload (authenticated)
  • HTTP POST Based API using the “curl” command line tool (authenticated via OAuth and HTTP-Basic Auth. Use django-piston for it)

Users will be identified by a 13-digit number or a string like “global-logistics.pl”. When a user is created he is assigned a 16-letter random password which is saved in cleartext and provisioned with the FTP server. This password is then mailed to the e-mail address of the user. To my understanding, usernames containing dots and dashes are problematic with the default django contrib.auth application.
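
The 16-letter random password could be generated along these lines. This is only a sketch: it uses the secrets module of Python 3.6+ (not available on the Python 2.6 named in the brief), and the function name make_password is an assumption.

```python
import secrets
import string


def make_password(length=16):
    """Return a random password made of ASCII letters only."""
    return "".join(secrets.choice(string.ascii_letters) for _ in range(length))
```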

All files uploaded should be saved as attachments in a CouchDB together with timestamp, connecting user, sha256 checksum and IP address.

Users should be provided with an authenticated page where they can see a log of the files uploaded by them. They should also be able to see a detailed FTP debug log of all activity in the last few hours seen from the IP address from which their browser is currently connecting.

Superusers should be able to inspect the pages/information of all users. Users should be unable to gain any information on other users.

We also need the opposite file transfer direction. A file and a user ID are provided by us, saved in CouchDB and then put into a user FTP directory “from_hudora”. Besides FTP the users should be able to download and delete files via a web GUI or via an HTTP API.

Platform:
FreeBSD 7.2, Python 2.6, Django 1.0.x

Deliverables:
Implement the Functionality above using external libraries and OpenSource technology to your liking.

  • Files should be identified by unique IDs created out of timestamp + sha256 of the file. I suggest base64.b32encode() to get a nice representation of the GUID. Use them e.g. as ID in CouchDB.
  • Use http://bitbucket.org/jespern/django-piston/
  • Call a dummy function message_arrived(uuid, username, datastream) when a message arrived. E.g. message_arrived('234FDS234GFSD', 'global-logistics.pl', '). We will later swap this function with code connecting to our internal software.
  • Provide a function send_message(uuid, username, datastream) for us to call to put a file into the system.
  • Add 100% Audit logging to the web-interface.
  • Users should have the following so far unused fields in their data model: webhook_url, upload_url.
  • Use plain white HTML pages without any additional web-Design
  • Use SSL/TLS for FTP and HTTP. No unprotected communication. Use self-signed certificates for testing.
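
A minimal sketch of the ID scheme and the dummy hook described in the list above. How exactly timestamp and checksum are combined is not specified in the brief; concatenating the timestamp digits with the raw SHA-256 digest before Base32-encoding is an assumption, as is the name make_file_id. The code targets a current Python 3.

```python
import base64
import hashlib
import time


def make_file_id(data, timestamp=None):
    """Build an ID from a timestamp plus the SHA-256 of the file content."""
    if timestamp is None:
        timestamp = time.time()
    digest = hashlib.sha256(data).digest()
    stamp = str(int(timestamp)).encode("ascii")
    # b32encode yields upper-case letters and the digits 2-7; drop the '=' padding.
    return base64.b32encode(stamp + digest).decode("ascii").rstrip("=")


def message_arrived(uuid, username, datastream):
    """Dummy hook, to be swapped later for code talking to the internal software."""
    return None
```

Because the Base32 alphabet contains only upper-case letters and digits, such an ID can be used unchanged as a CouchDB document ID, a file name or a URL path segment.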

Follow http://blogs.23.nu/disLEXia/2009/11/coding-guidelines/ – especially apply pep8 and pylint.

Extract Images from an IMAP Account

We need a Python command line application which connects to an IMAP4 server and scans all messages (in all subfolders) of a certain account. If the messages are older than a certain date and have attachments bigger than a certain size, the attachments should be extracted from the message.

In addition there should be an option to remove the attachment from the message (but keep the message) on the server (to save space on the server).
Output should be put in CouchDB (see http://code.google.com/p/couchdb-python/#) where you should put To:, From:, Date:, Subject, In-Reply-to, References and Message-Id in the document. The attachment should be added as a CouchDB attachment, preserving the MIME type and file name.

Command line invocation might look like this:

remove-attachments
--server=myserver.example.com # IMAP hostname
--user=username # IMAP user
--password=pass # IMAP password
--couchdb=http://localhost:5984/maildb # save to couchdb running here
--mindate=2009-03-01 # only check messages sent before this date
--minsize=1500 # only check messages bigger than 1500 KB
--remove # remove attachments

If an attachment is removed, a note should be put in its place in the message stating that an attachment has been removed.

Since we will have extremely big (> 50GB) mail folders, the tool must be able to be run several times on the same account without generating duplicates in CouchDB.
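
One way to make repeated runs safe is to derive the document ID deterministically from the message, so that a second run finds the document already present. The sketch below assumes the Message-Id plus the attachment file name identify an attachment well enough; a plain dict stands in for the database, and the names document_id and store_once are hypothetical.

```python
import hashlib


def document_id(message_id, attachment_name):
    """Derive a stable document ID from the Message-Id and the attachment name."""
    key = (message_id.strip() + "\n" + attachment_name).encode("utf-8")
    return hashlib.sha256(key).hexdigest()


def store_once(db, message_id, attachment_name, fields):
    """Store the document unless an earlier run already did; return True if stored."""
    doc_id = document_id(message_id, attachment_name)
    if doc_id in db:
        return False
    db[doc_id] = dict(fields, message_id=message_id, filename=attachment_name)
    return True
```

With a real database the membership test and the write would have to tolerate concurrent runs, which this sketch does not attempt.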

Platform:
Python 2.6, Unix/Linux

Deliverables:

  • Python 2.6 based application fulfilling the requirements above.
  • Including a requirements.txt file which can be used with the pip utility to install all dependencies.
  • Python Code must follow http://blogs.23.nu/disLEXia/2009/11/coding-guidelines/
  • Copyright of the code written by you for this project will be transferred to us

Modify “Nutch,” our Java based Web-Search engine

Crawling would be accomplished by something like ./bin/nutch crawl starturls.txt -dir crawl -depth 2 -topN 30000 and the HTML interface by dropping nutch-1.0.war into your favorite servlet container (I use Jetty).

Your task is to build a single JSP page for viewing statistics about the current search index. For that you need to use the Lucene API. Studying the source code of the tool “Luke” can probably show you exactly how to query the index (see http://www.getopt.org/luke/#).

The page should display

  • number of documents
  • number of terms
  • index last modified. Date in http://www.faqs.org/rfcs/rfc3339.html format
  • Any statistics you can get on the crawldb. http://is.gd/4Q7Jp http://issues.apache.org/jira/browse/NUTCH-558 and http://is.gd/4Q7Ny might provide pointers

This page will be used by us to monitor whether the nutch instance is “healthy”, still adding pages etc. Nutch is run on an intranet spidering about two dozen hosts.

Platform:
FreeBSD 7, JDK 1.6, nutch 1.0

Deliverables:

  • JSP Page displaying statistics.
  • If you need a newer version of nutch than 1.1 please provide us with the whole nutch installation
  • Use OpenSource Libraries where they are available. If you copy OpenSource code please mark it clearly and mention the License of the included code.
  • Copyright of the Code written by you for the project will be assigned to us. We might OpenSource the code if we consider it of general interest.
  • During development you will not get access to our servers, accounts, resources. Installation will be handled by us according to the documentation we provided.

Implement Paypal Express Checkout in Python/Django.

We require a minimal prototype implementation of Paypal Website Payments Pro integration into a Django application. No product choice, catalog, shopping cart or user registration functionality is required.

Platform:
Python 2.6, Django 1.0.x, PostgreSQL/SQLite

Deliverables:
Build an application where the web surfer can order a single product “testproduct” and choose between Paypal Express Checkout and “Creditcard”. If Paypal Express Checkout is chosen, collect payment and address information via Paypal Express Checkout. If “Creditcard” is chosen, collect address information in Django and then collect payment information via Paypal. See https://cms.paypal.com/cms_content/US/en_US/images/developer/combinedECoptions.gif for the two routes of payment flow. At http://is.gd/4zr2M there is plenty of further information.

Requirements:

  • Implementation of Paypal checkout process including minimal shop environment needed for prototyping.
  • Support for Direct Payment and Express Checkout
  • Support of (dummy) logo on the PayPal site (HDRIMG)
  • Support for a Shop Generated INVNUM and CUSTOM field
  • Display of order Details on the PayPal Page
  • Name-Value Pair (NVP) interface
  • Support for providing and handling CANCELURL
  • Use Authorization/Capture cycle (instead of direct Sale).
  • An internal interface to View Authorization and initiate Capture or Cancelation.
  • No support for Giropay required
  • Currency is “EUR”
  • Database engine to be used is sqlite3
  • You need to get all accounts, servers, sandboxes, etc. for yourself during development.
  • This is a prototype implementation. We will integrate it into our codebase ourselves.
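
The Name-Value Pair interface named in the requirements exchanges URL-encoded name=value strings. The sketch below shows only that encoding and decoding step with the Python 3 standard library; it does not talk to PayPal, the helper names are assumptions, and every field name and value used with it should be checked against PayPal's NVP documentation.

```python
from urllib.parse import parse_qs, urlencode


def encode_nvp(fields):
    """Serialize a dict into a URL-encoded name=value&name=value string."""
    return urlencode(sorted(fields.items()))


def decode_nvp(body):
    """Parse an NVP body into a flat dict, keeping the first value per name."""
    return {name: values[0] for name, values in parse_qs(body).items()}
```

A request body for the cancel URL and invoice number mentioned above could then be built as encode_nvp({"CANCELURL": "http://shop.example.com/cancel", "INVNUM": "R-1001"}), where both values are purely illustrative.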

Use of OpenSource libraries (GPL/BSD, etc.) is permitted. You might want to check http://github.com/johnboxall/django-paypal http://uswaretech.com/blog/2008/11/using-paypal-with-django/ http://www.djangosnippets.org/snippets/1181/ or http://github.com/defrex/django-paypal-cart

Deliverables:

  • Django 1.0.x based application fulfilling above requirements and runnable in the paypal sandbox
  • Including a requirements.txt file which can be used with the pip utility to install all dependencies
  • Plain HTML pages containing the needed forms/UI (no Webdesign needed).
  • Python Code must follow http://blogs.23.nu/c0re/2007/06/antville-15208/ including unittests

Build a Django Based URL shortening Service like e.g. tinyurl

  • Web form to “compress an address”, including functionality to “wish for” a custom address, like http://tinyurl.com/
  • The actual redirector
  • An API like http://is.gd/api_info.php
  • No admin functionality is needed
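
A common way to “compress an address” is to store the long URL under a numeric database ID and expose that ID in base 62. The brief does not prescribe a scheme, so the following is just one possible sketch; ALPHABET, encode_id and decode_code are hypothetical names.

```python
import string

# 0-9, a-z, A-Z: 62 URL-safe characters
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_id(number):
    """Turn a non-negative integer ID into a short base-62 code."""
    if number < 0:
        raise ValueError("ID must not be negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, rest = divmod(number, len(ALPHABET))
        chars.append(ALPHABET[rest])
    return "".join(reversed(chars))


def decode_code(code):
    """Turn a base-62 code back into the integer ID."""
    number = 0
    for char in code:
        number = number * len(ALPHABET) + ALPHABET.index(char)
    return number
```

Custom (“wished for”) addresses would need a separate lookup column, since they do not map back to a numeric ID.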

Platform:
Python 2.6, Django 1.0.x

Deliverables:

  • Django 1.0.x based application fulfilling above requirements.
  • Should work with PostgreSQL. If you prefer you can use CouchDB instead.
  • Including a requirements.txt file which can be used with the pip utility to install all dependencies.
  • Plain white HTML pages containing the needed forms/UI.
  • Python Code must follow http://blogs.23.nu/disLEXia/2009/11/coding-guidelines/
  • Use OpenSource Libraries as you wish
  • Copyright of the Code written by you for this project will be assigned to us

Automatically compare names in lists

There is an official sanction list of the European Union available at http://ec.europa.eu/external_relations/cfsp/sanctions/list/consol-list.htm http://ec.europa.eu/external_relations/cfsp/sanctions/list/version4/global/global.xml#. We need a Django application to compare this list with the names of our business partners, employees and the like to check if their names are on that list.

Tasks:

  • On user click (“Update List”) download and parse the list and probably cache it in the database.
  • If the user enters a single name, check if there is any match of this name with
    – “WHOLENAME”
    – “LASTNAME, FIRSTNAME”
    – “FIRSTNAME LASTNAME”
    – “FIRSTNAME MIDDLENAME LASTNAME”
  • No match on gender, address, title etc.
  • If there is a match, display all information on the matched entry, the current timestamp, the input name and the version of the XML file used. Otherwise display “no match”.
  • In addition allow the user to enter several lines in a “textarea” or upload a file and check line by line if there is a match to the embargo list. Display the information required above and a summary (“n entries checked, m matches”, etc.)
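
The single-name check from the tasks above can be sketched as follows. The dictionary keys (wholename, firstname, middlename, lastname) are placeholders for whatever the parsed XML provides, and the case-insensitive, whitespace-normalizing comparison is an assumption; the brief only names the four spellings to compare against.

```python
def normalize(name):
    """Upper-case and collapse whitespace so comparisons ignore formatting."""
    return " ".join(name.upper().split())


def name_variants(entry):
    """Return the set of spellings under which a list entry should match."""
    first = entry.get("firstname", "")
    middle = entry.get("middlename", "")
    last = entry.get("lastname", "")
    candidates = [
        entry.get("wholename", ""),
        "%s, %s" % (last, first) if first and last else "",
        "%s %s" % (first, last) if first and last else "",
        "%s %s %s" % (first, middle, last) if first and middle and last else "",
    ]
    return set(normalize(candidate) for candidate in candidates if candidate.strip())


def find_matches(name, entries):
    """Return all entries for which one of the spellings equals the input name."""
    wanted = normalize(name)
    return [entry for entry in entries if wanted in name_variants(entry)]
```

For the multi-line check, find_matches would simply be called once per input line while counting entries checked and matches found.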

Platform:

  • Python 2.6 on FreeBSD, Django 1.0.x

Deliverables:

  • Django 1.0.x based application fulfilling above requirements.
  • Should work with sqlite and PostgreSQL. If you prefer you can use CouchDB instead.
  • Including a requirements.txt file which can be used with the pip utility to install all dependencies.
  • Plain HTML pages containing the needed forms/UI.
  • Python Code must follow http://blogs.23.nu/disLEXia/2009/11/coding-guidelines/.
  • You are free to use OpenSource libraries. All rights to the code written by you will be transferred to us.

Extend, Modify or clone the Django authentication system

Extend, Modify or clone the Django authentication system to

  • export user data a) to LDAP and b) to Google Apps (see http://code.google.com/intl/de/apis/apps/gdata_provisioning_api_v2.0_reference.html#) Data should be exported on save/create of any record.
  • use CouchDB instead of an SQL database to store user account information. The user’s e-mail address should be used in the “username” field, so users don’t have to remember an additional username.
  • Provide a simple application to change/reset Password (this can reuse 99% Django code)
  • Provide a require_login decorator which checks if the user is already logged in and if not displays a login form. (You again can reuse a lot of django code for that.) The user should NOT be asked for his username but for his E-Mail Address instead.

During development we can NOT give you access to a Google Apps account for testing – you have to get one yourself.

Parts:

  • New user model compatible with Django’s User model
  • Application where an admin can create and delete users (might use the Django admin interface)
  • Application where users can change or reset their password.
  • Password reset should be implemented by emailing the user a one-time link valid for 48 h which, when followed, allows the user to set a new password. When the password has been changed, send an informational email about the fact. The technique is described at http://is.gd/4OfsH and called “Weak Technique C – Emailing instructions on how to reset password” (yes, I want you to implement a “weak” technique).
  • Sample of the login_required decorator
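
A one-time link valid for 48 hours can be built without extra database state by signing the e-mail address, a timestamp and the current password hash: once the password changes, the old link stops verifying. This is a sketch of that idea, not the technique from the linked article; SECRET_KEY, make_token and check_token are hypothetical names and the code targets a current Python 3.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical; use the project's real secret
MAX_AGE = 48 * 60 * 60     # 48 hours in seconds


def _signature(email, password_hash, timestamp):
    message = ("%s|%s|%d" % (email, password_hash, timestamp)).encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_token(email, password_hash, now=None):
    """Return 'timestamp-signature' for embedding in the reset link."""
    timestamp = int(time.time() if now is None else now)
    return "%d-%s" % (timestamp, _signature(email, password_hash, timestamp))


def check_token(token, email, password_hash, now=None):
    """True if the token is well-formed, unexpired and matches the user's state."""
    now = time.time() if now is None else now
    try:
        stamp, signature = token.split("-", 1)
        timestamp = int(stamp)
    except ValueError:
        return False
    if now - timestamp > MAX_AGE:
        return False
    expected = _signature(email, password_hash, timestamp)
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
```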

Platform:

Python 2.6, Django 1.0.x

Deliverables:

  • Django 1.0.x based application fulfilling above requirements.
  • Should work with sqlite and PostgreSQL. If you prefer you can use CouchDB instead.
  • Including a requirements.txt file which can be used with the pip utility to install all dependencies.
  • Plain white HTML pages containing the needed forms/UI.
  • Python Code must follow http://blogs.23.nu/disLEXia/2009/11/coding-guidelines/.
  • We encourage you to use existing OpenSource libraries where appropriate. You have to sign over the copyright of all code written by you for this project to us.

Create calculator for optimal packing

We have an application which calculates optimal packing of crates on a pallet and renders a packing example. See an example at http://www.hudora.de/code/palettenpacker/?palettenhoehe=1805&kartonmasse_x=320&kartonmasse_y=420&kartonmasse_z=360

The current source code is available at http://cybernetics.hudora.biz/dist/misc/palcalc-r2104.tar.gz

Currently rendering is done by generating a scene description for the ray-tracing package POV-Ray and calling the external ray-tracer. This is slow and results in many server-side dependencies.

Your task is to redo the rendering in Python without the need for external programs. The new rendering does not need to be ray-traced or have textures but has to have enough surface shading to make out the individual crates. You don’t have to implement text within the image.

Output should be PNG. There is a Bonus if you also provide SVG output.
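
For the SVG variant, flat shading of the three visible faces is already enough to tell crates apart. The sketch below is not based on palcalc; it assumes each crate is given as (x, y, z, dx, dy, dz), uses a simple oblique projection and draws crates back to front, which works for regular stacks but not for arbitrary arrangements. All function names are hypothetical.

```python
def project(x, y, z):
    """Oblique projection: depth (y) is drawn shifted up and to the right."""
    return (x + 0.5 * y, -(z + 0.5 * y))


def box_polygons(x, y, z, dx, dy, dz):
    """Return (fill, points) for the front, top and right face of one crate."""
    front = [(x, y, z), (x + dx, y, z), (x + dx, y, z + dz), (x, y, z + dz)]
    top = [(x, y, z + dz), (x + dx, y, z + dz),
           (x + dx, y + dy, z + dz), (x, y + dy, z + dz)]
    side = [(x + dx, y, z), (x + dx, y + dy, z),
            (x + dx, y + dy, z + dz), (x + dx, y, z + dz)]
    faces = [("#bbbbbb", front), ("#eeeeee", top), ("#888888", side)]
    return [(fill, [project(*corner) for corner in corners]) for fill, corners in faces]


def render_svg(boxes, width=400, height=400):
    """Render crates given as (x, y, z, dx, dy, dz) tuples into an SVG string."""
    parts = [
        '<svg xmlns="http://www.w3.org/2000/svg" width="%d" height="%d">' % (width, height),
        '<g transform="translate(20,%d)">' % (height - 20),
    ]
    # Painter's algorithm: far, low and left crates first so nearer ones cover them.
    for box in sorted(boxes, key=lambda b: (-b[1], b[2], b[0])):
        for fill, points in box_polygons(*box):
            coords = " ".join("%.1f,%.1f" % point for point in points)
            parts.append('<polygon points="%s" fill="%s" stroke="black"/>' % (coords, fill))
    parts.append("</g></svg>")
    return "\n".join(parts)
```

Scaling the millimetre dimensions to the image size is left out; the same polygons could be fed to an imaging library to produce the PNG output.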

You are free to use OpenSource Libraries. You are also free to re-use as much of palcalc as possible.

Platform:
Python 2.6, FreeBSD, Django 1.0

Deliverables:

  • Python 2.6 based application generating output similar to palcalc and a surrounding minimal web application.
  • Including a requirements.txt file which can be used with the pip utility to install all dependencies.
  • Python Code must follow http://blogs.23.nu/disLEXia/2009/11/coding-guidelines/.
  • Copyright of the code written by you for this project will be transferred to us

Build a simple Django Frontend for Hylafax

Hylafax is an OpenSource fax server. See http://www.hylafax.org/ for details. We require a Django based web frontend where you can

  • Upload a PDF and send it to a fax number
  • Enter some text, choose a predefined fax number and send the text to it
  • See a log of faxes sent out by this web application

There is a lot of OpenSource software implementing this (and much more) functionality. You are free to incorporate ideas from the OpenSource software listed at http://www.hylafax.org/content/Web_Based_Faxing and http://www.hylafax.org/content/Handbook:Server_Operation:Sending_Faxes#Cross-Platform_Clients
The actual fax sending will be done by Hylafax.

Deliverables:

  • Django 1.0.x based web frontend for faxing and viewing logs
  • Control of the Hylafax server via the HylaFAX client-server protocol. See http://www.hylafax.org/man/6.0.3/hfaxd.1m.html
  • Detailed instructions on how to configure Hylafax to work with the frontend (not modem setup, we already have that running)
  • Including a requirements.txt file which can be used with the pip utility to install all dependencies
  • Database will be PostgreSQL 8.x
  • Plain but valid HTML pages containing the needed forms/UI.
  • Python Code must follow http://blogs.23.nu/c0re/2007/06/antville-15208/ including unittests

Ajax Image upload/scaling/rotation/positioning

We need an implementation of a customer-facing web application for

1. uploading a JPEG image of several MB
2. fitting the image within a hardcoded shape/mask
3. allowing the user to interactively position the image within the shape, allowing him to scale, move and rotate the image.
4. generating the resulting images (JPEG) and storing the transformations applied (text).
5. No focus on HTML/CSS, just a white label interface.

We want customers to be able to upload an image and to position it on objects (Skateboards, Skate Wheels, T-Shirts). Unlike other printing companies we do full body printing. Therefore the upload tool needs to know the shape of the object (delivered as an image mask to the application) and has to allow the user to position the uploaded image within the mask. It also has to allow the user to rotate and zoom the uploaded image.

See http://s.hdimg.net/rapael_test/rapael.html for an example for possible placement technology.

Implementation environment:

  • We need a prototype implementation only
  • Server Side code must be done in Python, preferably Django 1.0.x
  • Client Side code must be done in jQuery 1.3.x and/or http://raphaeljs.com/
  • Code must be compatible with IE >= 6.0, Firefox >= 2.0, most recent Chromium, Opera and Safari.
  • Python Code must follow http://blogs.23.nu/c0re/2007/06/antville-15208/ except that no unittests are required.

Deliverables:

  • Django based application handling the backend processing
  • Including a requirements.txt file which can be used with the pip utility to install all dependencies
  • HTML pages containing the needed Javascript code.
  • Two alternate example image masks, e.g. circle and triangle. Alternatively you can use the image used in http://s.hdimg.net/rapael_test/rapael.html
  • Ability to upload an image, position it in the mask using Mouse drag AND buttons, zoom/unzoom using buttons, rotate using buttons.
  • Ability to save the applied transformation (e.g. “move 10,34; scale 0.78; rotate 90”) and generate a composite image as displayed in the browser on the server side.
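
The transformation text from the last deliverable could be parsed and written back roughly like this. The grammar (steps separated by semicolons, a name followed by comma-separated numbers) is inferred from the single example in the brief, and both function names are hypothetical.

```python
def parse_transformations(text):
    """Parse 'move 10,34; scale 0.78; rotate 90' into a list of (name, numbers)."""
    steps = []
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, arguments = part.partition(" ")
        numbers = tuple(float(value) for value in arguments.split(",") if value.strip())
        steps.append((name, numbers))
    return steps


def format_transformations(steps):
    """Inverse of parse_transformations, using the shortest number notation."""
    return "; ".join(
        "%s %s" % (name, ",".join("%g" % number for number in numbers))
        for name, numbers in steps
    )
```

On the server the parsed steps would then be replayed in order on the uploaded image before compositing it with the mask.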

Redesign two Webpages

We seek proposals on a new look for http://www.hudora.de/. The new look should be brighter, friendly and modern with a touch of Web 2.0 – but no rounded corners required. Please see the product packaging styleguide to get a feeling for our look and CI. We don’t require you to copy the packaging design – especially not the ellipse to the upper left, but see pages 8 and 10 for our logos and general color scheme.

The page should leave room for an additional product group based color as shown on page 2 of the style guide. Room for a 180×180 page specific image would be nice but is no requirement.

Our website is organized into three main topics:

  • “I have a HUDORA product and need spare parts/support”
  • “I want to inform myself about HUDORA products and where to buy them”
  • “I want to inform myself about the HUDORA company”

This should be prominently reflected on the homepage probably by icons. I think mturk.com solves this nicely. The homepage also should leave some room for news.

All pages need a contact/imprint link somewhere.

If you are interested in the company background, see here.

Please submit two HTML-Pages: The welcome page and one sub-page.

We require a clean XHTML + CSS based layout rendering nicely on IE7+, Firefox 2+, Safari 2+ and acceptable on IE 5+, Firefox 1+ and Opera 8+.

Train Tesseract OCR 2.0

Tesseract OCR (http://code.google.com/p/tesseract-ocr/#) is an Open Source OCR application. Your task is to train it for the language and fonts used at HUDORA. See http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract for details on the training process.

Build a file search system based on Solr

Solr is an open source enterprise search server with an HTTP interface – see http://lucene.apache.org/solr/ for details. Your task is to build a search engine for files on a local fileserver based on Solr. This task is twofold: first you have to build an indexer which can traverse a list of local directories, convert the files therein (DOC and PDF) to plain text and feed them to Solr for indexing.

The second task is to build a simple interface to that index in Django where you can query the Index in Solr.

Implementing TagField

Building on generic relations fields/generic.py, implement a TagField including decent support in admin. Functionality should be about the same as the different RoR acts_as_taggable implementations but generic, e.g.:

  • picture.tagstring = 'animal horse sunset'
  • picture.tags = [<tag animal>, <tag horse>, <tag sunset>]
  • Picture.objects.tagged_with(['horse', 'sunset'])
  • Tags.objects.tagged_with(['horse'])
  • Tags.objects.tagged_with(['horse']).filter(contenttype=Picture)

It should also have that Tag Cloud craze.
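
Independent of the Django field itself, the two pure-Python pieces behind this are splitting a tag string and sizing a tag cloud. The following is a sketch under the assumption that tags are whitespace-separated and case-insensitive; parse_tagstring and tag_cloud are hypothetical helper names.

```python
from collections import Counter


def parse_tagstring(tagstring):
    """Split 'animal horse sunset' into a sorted list of unique lower-case tags."""
    return sorted(set(tagstring.lower().split()))


def tag_cloud(tag_lists, steps=5):
    """Map each tag to a size class 1..steps, proportional to how often it is used."""
    counts = Counter(tag for tags in tag_lists for tag in tags)
    if not counts:
        return {}
    top = max(counts.values())
    # Integer ceiling of count * steps / top, so the most used tag gets `steps`.
    return {tag: max(1, (count * steps + top - 1) // top) for tag, count in counts.items()}
```

A template would then map the size class to a CSS class or font size.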

The implementation will be under the BSD License and should adhere to the Django coding standards. You should work with the Django community to try to get it integrated into the Django trunk.

Update the DoDoStorage System

DoDoStorage is a system to store documents. You can see it described at http://blogs.23.nu/c0re/stories/15294/ The backend doing the actual storage is implemented using sqlalchemy (all in all 395 LoC). Your task is to remove the dependency on sqlalchemy and use direct DB-API access compatible with psycopg2 (postgres) and sqlite3. Also you have to update unittests, documentation and setup scripts accordingly. Your modifications should be schema neutral, so the original database can be used unchanged.
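
One practical difference between the two drivers is the placeholder style: sqlite3 reports paramstyle "qmark" (?) while psycopg2 reports "pyformat" (%s). A small helper can hide this, as sketched below. The naive string replacement assumes the SQL contains no literal question marks, and the helper names as well as the table used for testing are hypothetical, not taken from DoDoStorage.

```python
import sqlite3


def adapt_placeholders(sql, paramstyle):
    """Rewrite '?' placeholders to '%s' for drivers using format/pyformat style."""
    if paramstyle in ("format", "pyformat"):
        return sql.replace("?", "%s")
    return sql


def fetch_all(connection, paramstyle, sql, parameters=()):
    """Run a query written with '?' placeholders on any DB-API connection."""
    cursor = connection.cursor()
    try:
        cursor.execute(adapt_placeholders(sql, paramstyle), parameters)
        return cursor.fetchall()
    finally:
        cursor.close()
```

The paramstyle value would be read from the driver module in use, e.g. sqlite3.paramstyle.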

You should also update the frontend to use a multithreaded Python-only HTTP server like CherryPyWSGIServer instead of wsgiref.simple_server.

As a final task please create a Project for DoDoStorage at http://code.google.com/hosting/createProject with you and me as admins. Add the code and content from http://cheeseshop.python.org/pypi/DoDoStorage/


You have to follow the coding standards outlined in http://blogs.23.nu/c0re/stories/15208/

The current version of DoDoStorage is available at http://static.23.nu/md/Pictures/DoDoStorage-0.3.dev-r2247.tar.gz

Scrape Information from Websites

Your task is to write a Python web scraper collecting pricing information on the Internet.
For a list of search terms (variable, several hundred) search 6 e-commerce sites and save the returned product information in a SQL database. For each product/entry found save a) the name/itemID/UPC/EAN b) who sells it c) for how much d) new or used e) where the seller is located, if available f) historical pricing data if available g) the URL you got this information from.
While doing so ensure that you don’t request too many pages per minute from each site. All six sites are in German, but you will be provided with a translation of the relevant terms.
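
Limiting the number of pages requested per minute from each site can be done with a small per-host throttle that is consulted before every request. This is a sketch only; the class name, the even spacing of requests and the injectable clock and sleep functions (which make it testable) are assumptions.

```python
import time


class HostThrottle(object):
    """Space out requests so each host sees at most `per_minute` requests per minute."""

    def __init__(self, per_minute, clock=time.time, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = {}  # host -> earliest time of the next request

    def wait(self, host):
        """Block until `host` may be contacted again; return the seconds waited."""
        now = self.clock()
        ready_at = self.next_allowed.get(host, now)
        delay = max(0.0, ready_at - now)
        if delay:
            self.sleep(delay)
        self.next_allowed[host] = max(now, ready_at) + self.interval
        return delay
```

The scraper would call throttle.wait(hostname) immediately before fetching each page, so that slow sites do not hold up requests to the other five.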

Port of a Ruby on Rails trouble Ticket System to Django

We have a “legacy” trouble ticket and spare part shipping system which is based on Ruby on Rails consisting of about 3750 LoC. Your task is to take that code and turn it into two Django applications: one for trouble Tickets and one for shipping spare parts. You also should improve on documentation and unit tests. The application contains several more interesting components like:

  • sending, receiving and parsing emails
  • interfacing to external systems via XML-RPC
  • generation of PDFs for invoices
  • matching of bank account transactions with open invoices.

Bin packing implementation

This is the real world implementation of a nice, hard computer science problem. It is about deciding how we pack things before shipping in our warehouse. See http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK5/NODE192.HTM and http://www.ams.org/featurecolumn/archive/bins1.html for some background.

We have a list of product boxes (“a shipment”) all having a weight and a set of dimensions. Some of them have to be put into an additional box, some already come in a box stable enough for shipping. The data you get is something like
[(id=p1, weight=5600, dimensions=100×200×300, stable=False), (id=p2, weight=10500, dimensions=100×400×300, stable=True), (id=p3, weight=14000, dimensions=200×200×300, stable=False)].

We have several different sizes of shipping crates. You can assume that there is an infinite supply of these shipping crates. No crate being shipped can be heavier than 31500 g. You get a list of available crate sizes and weights like
[(id=c1, weight=1200, dimensions=200×200×200), (id=c2, weight=1300, dimensions=300×300×300), (id=c3, weight=1400, dimensions=300×400×600)].

Your task is to find a way to pack the product boxes in a way that we need to ship as few units as possible.

So for example for the input above a solution would be:

[(c3, weight=7000, products=(p1)), (c3, weight=11900, products=(p2)), (c3, weight=15400, products=(p3))] is the dumbest solution: put each product in its own crate.

Slightly better:
[(p2, weight=10500), (c3, weight=21000, products=(p1, p3))] which means:
ship p2 without a crate, pack the rest in the biggest crate (c3) available.

This is also obviously not the best solution. [(c3, weight=31500, products=(p1, p2, p3))] would be a better solution.

Since this problem domain is NP-hard the code is not expected to always find the best solution in finite time. Therefore it must be able to terminate after a given time and give “the best possible” result. We expect data sets of up to 80 product boxes (usually far fewer, around 15) and an upper runtime limit of 300 seconds. But the algorithm should also be able to get reasonable results within 30 seconds.
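
The “best result within a time limit” behaviour can be sketched with a greedy first-fit heuristic that is retried on random orderings until the deadline passes. This is a heavily simplified illustration, not a solution to the task: it uses a single crate type, ignores the stable flag, assumes every product fits into an empty crate, and checks only weight and total volume, which is a necessary but not a sufficient condition for boxes to fit geometrically. All function names are hypothetical.

```python
import random
import time

MAX_WEIGHT = 31500  # grams per shipped unit, from the brief


def volume(dimensions):
    a, b, c = dimensions
    return a * b * c


def first_fit(products, crate):
    """Place products in the given order into the first crate with room left."""
    crate_id, crate_weight, crate_dims = crate
    crates = []  # each entry: [total_weight, free_volume, [product ids]]
    for pid, weight, dims in products:
        for entry in crates:
            if entry[0] + weight <= MAX_WEIGHT and volume(dims) <= entry[1]:
                entry[0] += weight
                entry[1] -= volume(dims)
                entry[2].append(pid)
                break
        else:
            # No open crate has room: open a new one.
            crates.append([crate_weight + weight, volume(crate_dims) - volume(dims), [pid]])
    return [(crate_id, total, ids) for total, _, ids in crates]


def pack(products, crate, time_limit=30.0, rng=random):
    """Return the packing with the fewest crates found within `time_limit` seconds."""
    deadline = time.time() + time_limit
    # Start with first-fit-decreasing (heaviest first), then try random orders.
    best = first_fit(sorted(products, key=lambda p: -p[1]), crate)
    while time.time() < deadline and len(best) > 1:
        order = list(products)
        rng.shuffle(order)
        candidate = first_fit(order, crate)
        if len(candidate) < len(best):
            best = candidate
    return best
```

With the example data above and crate c3, the heaviest-first pass already puts p3, p2 and p1 into one crate of 31500 g, so the search loop is skipped.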

A set of real world test data will be provided when the project starts and is to be kept confidential.

Port of a Ruby on Rails Addressbook Application to Django

We have a little Ruby on Rails based Addressbook with 77 LoC of models, 130 LoC of controller and around 150 lines of template. Your task is to take that code and turn it into a Django application. Noteworthy features include images per address and generation of vCards (use http://vobject.skyhouseconsulting.com/ for that in Python).

Port a Python application from one Database Mapper to another (Django – SQLalchemy)

I have a small but database heavy Python application (730 lines of code (LoC)) which accesses a PostgreSQL database using the database mapper supplied with the Django framework. In addition I have about 1050 LoC test code for this application.

Your Task is to port this application from using the Django DB mapper to using the SQLalchemy DB mapper (still using PostgreSQL) and while porting you should clean up the code, add tests and ensure 100% test coverage. You might introduce changes to the database layout if this helps with the porting effort, since no real world data has to be converted.

This project has a very high focus on code quality and I expect well written code coming with extensive unit tests. Unit tests should in the end tally about twice as many LoC as the actual code. The code must be rock solid and also consider parallel access to the database by dozens of writers.

Pointers:
http://www.sqlalchemy.org/
http://www.djangoproject.com/documentation/model_api/
the software to be converted: http://www.hosted-projects.com/trac/hudora/public/browser/myPL/mPLtest/myPL/models.py
the tests to be converted: http://www.hosted-projects.com/trac/hudora/public/browser/myPL/mPLtest/test/test_mPL.py
example coverage tests: http://svn1.hosted-projects.com/hudora/public/huBarcode/test/test_coverage.py
pyLint: http://www.logilab.org/857

Find how to send out a packet (logistics)

For a shipping related application I need to print “routing information” on labels. The full specification is 25 pages but fairly simple (see ZIP). I guess so far I have implemented 50 % of this specification. (See PDF in ZIP page 15: the right hand side is implemented.)

Your task is to implement the rest of the specification, write unittests for all functionality and ensure the existing and the new code adheres to the specification.

The code produced should be in Python and follow PEP 8 – “Style Guide for Python Code” and result in a pylint score of 8 or more. It also should come with extensive Unittests. The unittests should use coverage.py to see what is covered by the tests.

PDF spec
Current implementation

Write a generator for EDIFACT PRICAT/IFTMIN Messages

Implementing various EDIFACT/EANCOM exporters. EDIFACT is a standard to exchange commercial messages. EANCOM can be interpreted as a subset of EDIFACT. It is an inelegant, text-only protocol. We want a framework implemented in Python to generate several different EANCOM messages:

1. Implement an EDIFACT/EANCOM PRICAT exporter.
PRICAT is an EDIFACT/EANCOM standard to exchange product catalogue and pricing information. We have an in-house application providing product information as a Python object with about 250 data-fields/methods per product. We have already implemented several exporters to formats like XML, CSV and XLS. Your task is to implement a small framework in Python for writing EDIFACT messages and then build a PRICAT exporter on top of that. Documentation on EANCOM/PRICAT is available on the Internet but can also be provided by us.
2. Implement an EDIFACT/EANCOM IFTMIN exporter
IFTMIN is a standard to basically tell a trucking or shipping company what you want them to transport and where it should be transported. It is an inelegant, text-only protocol. We have an in-house application providing shipping information as a Python object. We have already implemented an exporter to FEDAS/XML. Your task is to build an IFTMIN exporter. Documentation on EANCOM/IFTMIN is on the Internet but can also be provided.
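
To give an idea of what a "small framework for writing EDIFACT messages" has to do at the lowest level, here is a minimal sketch of segment serialization. It assumes the default EDIFACT service characters (`+` between data elements, `:` between components, `?` as release character, `'` as segment terminator). The function names `escape` and `segment` are hypothetical and not taken from any existing code.

```python
def escape(value):
    """Prefix EDIFACT service characters with the release character '?'."""
    out = []
    for char in str(value):
        if char in "?+:'":
            out.append("?")
        out.append(char)
    return "".join(out)


def segment(tag, *elements):
    """Build one segment; a list or tuple element becomes a composite joined by ':'."""
    parts = [tag]
    for element in elements:
        if isinstance(element, (list, tuple)):
            parts.append(":".join(escape(component) for component in element))
        else:
            parts.append(escape(element))
    return "+".join(parts) + "'"
```

With placeholder values, `segment("UNH", "1", ["PRICAT", "D", "96A", "UN"])` returns `UNH+1+PRICAT:D:96A:UN'`. The actual PRICAT and IFTMIN exporters would be built from such segments according to the EANCOM message guides.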

The attached zip contains example data to give you an idea of the object structure used. It also contains example code to see how we work with our current objects. This code is purely informational to you and not extremely important even when claimed so by ROC.

To get a general understanding of EDIFACT you might want to check the following links:

  • http://www.edifactory.de
  • http://www.ean.se/EANCOM_2002/ean02s4/user/part2/pricat/examples.htm
  • http://www.ean.se/EANCOM_2002/ean02s4/user/part2/iftmin/examples.htm

These links are background information only and not part of the contract.

Barcode Recognition Software

Create a Python program or a C extension to Python that finds and decodes barcodes in pictures. Your library should take a PIL Image object as a parameter and return a list of all barcodes of type “code128” found in the image.

PIL: http://www.pythonware.com/products/pil/
Code 128: http://www.adams1.com/pub/russadam/128code.html
Barcode Decoding: http://www.mperfect.net/barCode/

datamatrix barcode generation in python

datamatrix is a 2d barcode format. I require an implementation (or port of an existing library) to encode urls (meaning lowercase letters, numbers and [.-/: ]) in datamatrix barcodes.

The implementation must be in pure Python. There is already an OpenSource C-Language version of datamatrix generation called libdmtx and Python bindings to that C library called pydmtx but datamatrix generation without additional C code is needed by me.

Since libdmtx is OpenSource and the results of your work will be also open sourced, you may feel free to directly port the datamatrix generation code in libdmtx over from C to Python.

Your Code needs to generate the final images via PIL, the Python imaging library.

Decoding of datamatrix barcodes is not part of this project.

Further introduction in what datamatrix is can be found at
http://www.libdmtx.org/resources.php
http://grandzebu.net/informatique/codbar-en/datamatrix.htm
(These links are purely informational and not part of the contract).

QRcode barcode generation in python

QRcode is a 2d barcode format. I require an implementation (or port of an existing library) to encode urls (meaning lowercase letters, numbers and [.-/: ]) in QRcode.

The implementation must be in pure Python. There are already several implementations in different languages. The Python implementation should become part of the huBarcode project and QRcode should have the same interface/API as huBarcode.datamatrix.

Your code needs to generate the final images via PIL, the Python imaging library – please see the deliverables for further details.

Decoding of QRcode is not part of this project.
Further Information:
http://en.wikipedia.org/wiki/QR_Code (links at the bottom point to implementation examples)
http://www.hosted-projects.com/trac/hudora/public/wiki/huBarcode
http://megaui.net/fukuchi/works/qrencode/index.en.html
http://swetake.com/qr/ruby/qr_rb.html
http://swetake.com/qr/php/qr_php.html
http://swetake.com/qr/qr_cgi.html
http://swetake.com/qr/java/qr_java.html

EAN128 barcode generation in Python

I’m looking for a follow-up to our datamatrix project. Based on the same library structure, coding standards etc., I would like to see EAN-13 and code128 generation. EAN-13 needs a facility to write the EAN as text below the barcode. (See http://www.hudora.de/media/,/barcodes/ean/4005998148501.png for an example)
There should also be a distutils setup.py.
http://www.reportlab.com/ftp/extensions/rlbarcode-0.9.2.zip might be a nice base for inspiration.
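
One small building block of EAN-13 generation is the check digit, which is also part of the text printed below the barcode. The sketch below uses the standard weighting (1 and 3 alternating from the left over the first twelve digits). The function names are hypothetical and do not reflect the huBarcode API.

```python
def ean13_check_digit(first_twelve):
    """Compute the 13th digit of an EAN-13 from its first twelve digits."""
    if len(first_twelve) != 12 or not first_twelve.isdigit():
        raise ValueError("expected a string of exactly 12 digits")
    # Positions are weighted 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(digit) * (3 if index % 2 else 1)
                for index, digit in enumerate(first_twelve))
    return (10 - total % 10) % 10


def ean13_text(first_twelve):
    """Return the full 13-digit string to be written below the barcode."""
    return first_twelve + str(ean13_check_digit(first_twelve))
```

For the number used in the example image above, `ean13_text("400599814850")` returns `4005998148501`.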

write python unittests for a few classes

I have 8 or so classes in a Python/Django project which lack unittests. Your task is to write unittests for each of these classes. The attached ZIP contains the (well documented, I like to think) source code and some examples, taken from Django, of how to build the unittests.
$ wc -l fields/*.py middleware/*.py
10 fields/__init__.py
100 fields/audit.py
49 fields/defaulting.py
263 fields/scalingimagefield.py
0 middleware/__init__.py
45 middleware/threadlocals.py
467 total

See http://svn1.hosted-projects.com/hudora/public/huDjango/

Build a simple Django and AJAX/mochikit Demonstrator(repost)

Your task is to build a simple proof of concept application using Django and AJAX/MochiKit. The application would consist of a single page where you can add items to and remove items from a list, and these actions would be communicated to the server via XMLHttpRequest and reflected in the DB. The server also should be able to send back status messages (“can’t create entry” etc.) which should be displayed in the page.
In addition I want a shipping method (select box) to be added to each item, which when changed updates the data on the server, i.e. the ShippingItem table will have one additional column `shippingmethod`.
This all should happen without reloading the page. You should implement the server (Django/Python) and Client (HTML/Javascript/MochiKit) components of the system. See http://www.mochikit.com/

convert 121 LoC java to jython

Convert a 121 line Java program which uses JasperReports to generate PDF files to jython (that’s Java based Python). See src/XmlJasperInterface.java in the attached file.
The attached file contains the source and all JAR files and is ready to run, which is why it is so big. It does not contain additional important information. If your download is slow, contact me and I will post only the source code.

JasperReports Compiler

JasperReports is a Java library to turn data into PDFs. How to do that is described in an XML file (.jrxml). This file has to be translated into a compiled form (.jasper) using a library function in JasperReports and can then be used to generate PDFs.
Your task is to write a small command line tool in Java using the JasperReports library to convert a jrxml file into a jasper file.

Calculate the number of weekdays between two dates

Implement something like this:
calculate_weekenddays_between(timestamp1, timestamp2): calculates the number of saturdays and sundays between timestamp1 and timestamp2

calculate_nonweekendhoursbetween(timestamp1, timestamp2): calculates the hours between two timestamps excluding saturdays and sundays. This probably would be: timestamp2 – timestamp1 – calculate_weekenddays_between(timestamp1, timestamp2)

Both Functions should operate on Python datetime objects and return timedelta objects.
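
A minimal sketch of the two functions follows. It rests on one interpretation of the description: since both functions return timedelta objects, the first one returns the amount of time between the two timestamps that falls on a Saturday or Sunday rather than a plain day count. It also assumes naive datetime objects with timestamp1 not later than timestamp2.

```python
from datetime import datetime, timedelta


def calculate_weekenddays_between(timestamp1, timestamp2):
    """Return the time between the timestamps that falls on Saturdays and Sundays."""
    if timestamp2 < timestamp1:
        raise ValueError("timestamp1 must not be later than timestamp2")
    total = timedelta(0)
    # Walk through the range one calendar day at a time, starting at midnight.
    day_start = datetime.combine(timestamp1.date(), datetime.min.time())
    while day_start < timestamp2:
        day_end = day_start + timedelta(days=1)
        if day_start.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            total += min(timestamp2, day_end) - max(timestamp1, day_start)
        day_start = day_end
    return total


def calculate_nonweekendhoursbetween(timestamp1, timestamp2):
    """Return the time between the timestamps, excluding Saturdays and Sundays."""
    weekend = calculate_weekenddays_between(timestamp1, timestamp2)
    return (timestamp2 - timestamp1) - weekend
```

For a range from Friday noon to the following Monday noon, the first function returns two days and the second returns one day.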

convert a Python Script to run as a service on Win2k3

I have a small Python script (40 LoC) which runs permanently on a Windows server. It gets requests via a Python specific protocol, and forwards them via ODBC.
Your task is to change the script to run as a Windows service so it is started automatically when the machine boots.
You also should modify the script so it does not use a hardcoded password and username but reads them from a SoftM_ODBCadapter.ini file, which should be in the same directory as the Python script.
Find attached the script including all libraries, or check it at http://www.hosted-projects.com/trac/hudora/public/browser/pySoftM/server/SoftM_ODBCadapter.py
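
Reading the credentials from the ini file could look roughly like the sketch below, which uses the `configparser` module of current Python versions. The section name `odbc` and the keys `username` and `password` are assumptions, as are the function names. The Windows service part is not shown.

```python
import configparser
import os


def default_ini_path(script_file):
    """Locate SoftM_ODBCadapter.ini in the same directory as the given script."""
    directory = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(directory, "SoftM_ODBCadapter.ini")


def read_credentials(ini_path):
    """Return (username, password) from the assumed [odbc] section of the ini file."""
    # interpolation=None keeps '%' characters in passwords from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(ini_path):
        raise FileNotFoundError(ini_path)
    return parser.get("odbc", "username"), parser.get("odbc", "password")
```

Inside the script this would be called as `read_credentials(default_ini_path(__file__))`.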

Build a Python interface to OpenGeoDB

OpenGeoDB is a database with ZIP/area codes and geolocations for Germany. Your task is to build a Python based interface to OpenGeoDB. Four kinds of functionality are required:
1) Given a ZIP-Code, dump all data in OpenGeoDB related to that location
2) Given two ZIP Codes, calculate the distance between the locations.
3) Given a List of Zip Codes (E.G. Shop locations) and an additional Zip Code (customer), order that list by distance to the additional zipcode.

The software should be written in Python and use sqlite as a backend. It will be published under a BSD License so you can use it in other projects. See http://opengeodb.hoppe-media.com/index.php?FrontPage_en for background.
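
The distance calculation and the ordering by distance can be sketched as follows. The table layout (`locations` with `zip`, `lat`, `lon`) is hypothetical and not the real OpenGeoDB schema, and the three coordinates are rough city-centre values used only for illustration. Distances are great-circle distances computed with the haversine formula.

```python
import math
import sqlite3

EARTH_RADIUS_KM = 6371.0


def open_demo_db():
    """Create an in-memory table with a hypothetical (zip, lat, lon) layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE locations (zip TEXT PRIMARY KEY, lat REAL, lon REAL)")
    conn.executemany(
        "INSERT INTO locations VALUES (?, ?, ?)",
        [("10115", 52.52, 13.405), ("20095", 53.55, 10.0), ("80331", 48.137, 11.575)],
    )
    return conn


def coordinates(conn, zip_code):
    """Return (lat, lon) in degrees for a ZIP code."""
    row = conn.execute(
        "SELECT lat, lon FROM locations WHERE zip = ?", (zip_code,)
    ).fetchone()
    if row is None:
        raise KeyError(zip_code)
    return row


def distance_km(conn, zip_a, zip_b):
    """Great-circle distance between two ZIP codes (haversine formula)."""
    lat1, lon1 = map(math.radians, coordinates(conn, zip_a))
    lat2, lon2 = map(math.radians, coordinates(conn, zip_b))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sort_by_distance(conn, zip_codes, reference_zip):
    """Order ZIP codes by their distance to the reference ZIP code."""
    return sorted(zip_codes, key=lambda z: distance_km(conn, z, reference_zip))
```

With the demo data, `sort_by_distance(conn, ["80331", "20095"], "10115")` puts the Hamburg ZIP code before the Munich one.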

Update the DoDoStorage System

DoDoStorage is a system to store documents. You can see it described at http://blogs.23.nu/c0re/stories/15294/ The backend doing the actual storage is implemented using sqlalchemy (all in all 395 LoC). Your task is to remove the dependency on sqlalchemy and use direct DB-API access compatible with psycopg2 (postgres) and sqlite3. Also you have to update unittests, documentation and setup scripts accordingly. Your modifications should be schema neutral, so the original database can be used unchanged.
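
One practical difference when writing plain DB-API code for both drivers is the parameter placeholder style: sqlite3 uses `?` (paramstyle "qmark") while psycopg2 uses `%s` (paramstyle "pyformat"). The following sketch shows one possible way to handle this. The table `documents` and its columns are hypothetical and not the DoDoStorage schema, and the naive string replacement assumes that the SQL text contains no literal question marks.

```python
import sqlite3


def adapt_placeholders(sql, dbapi_module):
    """Rewrite '?' placeholders to '%s' for drivers using format/pyformat style."""
    if dbapi_module.paramstyle in ("format", "pyformat"):
        return sql.replace("?", "%s")
    return sql


def fetch_document(conn, dbapi_module, key):
    """Look up one row using only the plain DB-API cursor interface."""
    sql = adapt_placeholders("SELECT body FROM documents WHERE doc_key = ?", dbapi_module)
    cursor = conn.cursor()
    cursor.execute(sql, (key,))
    row = cursor.fetchone()
    cursor.close()
    return row[0] if row else None
```

The same `fetch_document` call would then work with a connection from either `sqlite3` or `psycopg2`, given the matching module as `dbapi_module`.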

You also should update the frontend to use a multithreaded, Python-only HTTP server like CherryPyWSGIServer instead of wsgiref.simple_server.

As a final task please create a Project for DoDoStorage at http://code.google.com/hosting/createProject with you and me as admins. Add the code and content from http://cheeseshop.python.org/pypi/DoDoStorage/

You have to follow the coding standards outlined in http://blogs.23.nu/c0re/stories/15208/
The current version of DoDoStorage is available at http://static.23.nu/md/Pictures/DoDoStorage-0.3.dev-r2247.tar.gz

Train Tesseract OCR 2.0

Tesseract OCR (http://code.google.com/p/tesseract-ocr/#) is an Open Source OCR application. Your task is to train it for the language and fonts used at HUDORA. See http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract for details on the training process.

Implement HTTP-Auth in Django

HTTP-Auth is the native authentication Method of the HTTP-Protocol but nowadays seldom used. Instead cookie based authentication is used. But that does not work well with automated tools, etc. Your task is to add Support to the Django Web Framework to handle HTTP Basic and Digest (see http://www.faqs.org/rfcs/rfc2617.html#) Authentication. The implementation Platform is Apache 2.x with mod_python. Use httplib2 as a testing client. See http://blogs.23.nu/c0re/stories/1410/ and http://blogs.23.nu/c0re/stories/7409/ for some Background on HTTP Authentication.
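
The Basic half of the task boils down to decoding the Authorization header, which Django exposes as `request.META["HTTP_AUTHORIZATION"]`. A framework-independent sketch of that decoding step is shown below. The function name `parse_basic_auth` is hypothetical, and Digest authentication, which needs nonce handling, is not covered.

```python
import base64


def parse_basic_auth(header_value):
    """Return (username, password) from an 'Authorization: Basic ...' value, or None."""
    if not header_value:
        return None
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:  # covers invalid base64 and undecodable bytes
        return None
    username, separator, password = decoded.partition(":")
    if not separator:
        return None
    return username, password
```

For the example credentials from RFC 2617, `parse_basic_auth("Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==")` returns `("Aladdin", "open sesame")`. A view or middleware would answer with status 401 and a `WWW-Authenticate` header whenever the function returns None or the credentials do not match.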

Electronic Spare-Part information and ordering system

For a call-center setting we currently have a web-based (Ruby on Rails) system allowing agents to search products, get a listing of relevant spare parts and create an order. We want to bring this to a new level by recreating the application with a more desktop-like feel and making it faster to use by leveraging AJAX.

This code will be for internal use but if there is a way of abstracting it and publishing it we are happy with that.

Electronic order-sheet

Using “AJAX technologies” we would like to implement an electronic order-sheet. Basically it will be a growing table using AJAX searches for pulling in product data and summing up things like total price etc.

This code will be for internal use only and not be published.

Further information and details: Our company name HUDORA GmbH is often misspelled as hondura hondora hodora budora gudora hidora hjdora or hodura.