[Erp5-dev] Proposal for CSV import/export enhancements
Łukasz Nowak
2007-10-16 11:22:18 UTC
Hello,

While playing with production we are able to import BOMs and much
other useful information from various CAD/CAM/etc. programs using CSV.

As CSV is quite...hm...undefined, there are many problems. Many
programs use different implementations of CSV: some use ";" as the
separator, some ",". Some use '"' as the quote mark, some "'". Some use
UTF-8 encoding, others ASCII, and so on. So, IMHO, importing CSV could
be a two-phase process. First, one uploads the file; second, one chooses
some attributes and the user is shown a special listbox which
"simulates" how the system will understand the provided CSV. Something
similar to the OpenOffice CSV import/export dialog.
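
The "simulate" step could look roughly like the sketch below. It assumes
Python 3 and the standard csv module; the function name preview_csv and
the sample data are made up for illustration and are not part of ERP5.
The user-chosen separator, quote mark and encoding are applied to the
uploaded bytes, and only the first few parsed rows are returned for
display in the listbox.

```python
import csv
import io


def preview_csv(raw_bytes, delimiter=",", quotechar='"',
                encoding="utf-8", max_rows=5):
    """Parse the first rows of an uploaded CSV with user-chosen settings.

    Hypothetical helper: nothing is imported here, the rows are only
    returned so that the user can check how the file is understood.
    """
    text = raw_bytes.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=delimiter, quotechar=quotechar)
    rows = []
    for row in reader:
        rows.append(row)
        if len(rows) >= max_rows:
            break
    return rows


# Illustrative sample: semicolon-separated, single-quoted fields.
sample = "part;description;qty\n'A-1';'Bolt; M6';4\n".encode("utf-8")
for row in preview_csv(sample, delimiter=";", quotechar="'"):
    print(row)
```

If the preview looks wrong (for example every line ends up in a single
column), the user changes the settings and previews again before the
real import is started.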

Exporting could also be enhanced to support many CSV variants.

I'm able to provide such dialogs, create basic unit tests, and make
such changes backward compatible. We need it, and I think some other
implementations would be happy to have full-featured CSV import/export.

What do you think about it? Would you agree to create an
erp5_csv_enhanced Business Template on which I would do my development
and then, after review and acceptance, merge it into erp5_core? Or shall
I do it in-house and publish the work when done? Sending work in
progress to the mailing list is quite, hm, problematic. It would end up
with patches to patches to patches, and so on...

Regards,
Luke
--
Łukasz Nowak R&D Ventis http://www.ventis.com.pl/
tel: +48 32 768 16 85 fax: +48 32 392 10 61
``Use the Source, Luke...''
I am only craftsman. Maybe good one, but only craftsman.
bartek
2007-10-16 11:45:52 UTC
Post by Łukasz Nowak
What do you think about it? Would you agree to create an
erp5_csv_enhanced Business Template on which I would do my development
and then, after review and acceptance, merge it into erp5_core? Or shall
I do it in-house and publish the work when done? Sending work in
progress to the mailing list is quite, hm, problematic. It would end up
with patches to patches to patches, and so on...
I think it is generally a good way to proceed with such extra or
community-made stuff. I am not sure the word "enhanced" is the most
appropriate, since in this case it is in fact an experimental or
development version which is going to be merged. Also, maybe it would
be more convenient to use a prefix instead of a suffix (like
"contrib_erp5_csv_style", or something similar), so that it is cleanly
separated from the core bt's.


Bartek
--
"feelings affect productivity. (...) unhappy people write worse
software, and less of it."
Karl Fogel, "Producing Open Source Software"
Jean-Paul Smets
2007-10-16 20:10:36 UTC
Hi,

I am not sure developing CSV stuff makes any sense today, now that we
have oood. I would rather recommend using oood to import / export any
kind of file, including CSV.

For example, what you describe here:

"First, one uploads the file; second, one chooses some attributes and
the user is shown a special listbox which "simulates" how the system
will understand the provided CSV. Something similar to the OpenOffice
CSV import/export dialog."

is mostly implemented already using the OpenOffice format.

I do not see the point of developing the same thing for CSV when it
already exists for the ODS file format and oood can do the conversion
quite well (with good support for all kinds of CSV formats and
conventions).

My recommendation would rather be to
1- document what already exists based on ODS
2- use it
3- extend it (with unit tests in particular)
4- extend it more to include oood in the loop and support CSV, XLS, etc.
5- contribute directly to existing bt5 after making sure existing unit
tests do not break
6- and at some point remove the current CSV support in ERP5

This is what we are planning to do and any help would be very welcome.

Now, it is possible that I missed something about CSV which does not fit
in the plan. Please let me know if something is missing.

Regards,

JPS.
--
Jean-Paul Smets-Solanes, Nexedi CEO - Tel. +33(0)6 62 05 76 14
Nexedi: Consulting and Development of Libre / Open Source Software
http://www.nexedi.com
ERP5: Libre/ Open Source ERP Software for small and medium companies
http://www.erp5.org
Jérome Perrin
2007-10-17 10:02:29 UTC
Hi,

As you pointed out, despite looking simple, csv is in reality extremely
complex (you can look at the python csv module implementation); that's
one more reason why we concentrate on importing from OpenOffice. It
still needs improvement, but here's what we have today:

- Import of categories from a spreadsheet (Import/Export action from
portal_categories). This uses oood to convert the input file, if for
instance you upload a .xls file. I haven't tried it with CSV, so I don't
know if oood can convert from CSV.

- There is an interesting prototype on the person module: "Import
Persons from OpenOffice Calc".
This tool proposes to map OpenOffice columns to document properties. As
far as I know it doesn't work for categories, and it probably has other
problems, but it's IMHO a good start.

As far as I know this is the current status of OOo import in ERP5. If we
want better csv support, one way might be to add more csv to ods / ods
to csv support directly in oood, by "normalizing" csv in python using
csv module. This is just an idea, I haven't tried it at all, and I'm not
sure that information like "this column is numeric" can be kept with csv
only anyway ...
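
To make the "normalizing" idea concrete, here is a minimal sketch,
assuming Python 3 and the standard csv module. It is not oood code; the
function name normalize_csv and the choice of comma plus double quote as
the canonical dialect are assumptions made only for this example. The
input is read with whatever separator and quote mark it uses and is
written back in one fixed dialect, so that a later converter only has to
handle a single flavour.

```python
import csv
import io


def normalize_csv(text, delimiter, quotechar):
    """Re-emit CSV text as comma-separated, double-quoted CSV.

    Hypothetical helper: the source dialect is passed in explicitly,
    and the target dialect is an arbitrary choice for this sketch.
    """
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=delimiter, quotechar=quotechar)
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=",", quotechar='"',
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Illustrative input: semicolons as separator, single quotes as quote mark.
source = "part;qty\n'Bolt, M6';4\n"
print(normalize_csv(source, delimiter=";", quotechar="'"))
```

Every cell is still a plain string after this step, which matches the
doubt above: information such as "this column is numeric" is not carried
by CSV itself.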

Jérome
Łukasz Nowak
2007-10-17 11:50:53 UTC
Hello Jerome,

OK. Understood. I'd very much like to help with improving another way
of importing data into ERP5 from external systems. But I do not have a
starting point, a working example, or anything else for now. I _think_ I
*feel* the general idea, but I might be wrong. I'll _try_ to refer to
the unit tests, but any nice example of what is working right now would
speed up my learning of those aspects and my future contributions.

For now I've prepared some "intelligence" (or stupidity ;) ) for
CSV-related imports; it is attached as a .tar.gz. Since adding any user
interaction is quite hard (I cannot do it the way I thought, and I'll
wait until the ODS-based work is ready), and since it is possible to
figure out the quote and separator characters from the CSV file itself,
I did that. Please treat it as a proof of concept, but it will be used
(and roughly tested) in our environment. The data used for tests is real
data with some scrambling applied.
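
For readers without the attachment, guessing the separator and quote
mark from the file content can be sketched with the Sniffer class of
Python's standard csv module. This is not the attached code, whose
content is not shown in this thread; the function name detect_dialect,
the candidate separators and the fallback values are assumptions for
this example (Python 3).

```python
import csv


def detect_dialect(sample_text, candidates=";,\t|"):
    """Guess (delimiter, quotechar) from a text sample.

    Hypothetical helper built on csv.Sniffer, which is heuristic and
    can fail on short or irregular samples.
    """
    try:
        dialect = csv.Sniffer().sniff(sample_text, delimiters=candidates)
    except csv.Error:
        # The sniffer could not decide: fall back to comma and double quote.
        return ",", '"'
    return dialect.delimiter, dialect.quotechar


# Illustrative sample: semicolon-separated with double-quoted fields.
sample = 'part;description;qty\n"A-1";"Bolt, M6";4\n"A-2";"Nut";8\n'
delimiter, quotechar = detect_dialect(sample)
print(repr(delimiter), repr(quotechar))
```

Because the detection is only a guess, it is probably safest to show the
result to the user, or at least to log it, before importing.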

Regards,
Luke
--
Łukasz Nowak R&D Ventis http://www.ventis.com.pl/
tel: +48 32 768 16 85 fax: +48 32 392 10 61
``Use the Source, Luke...''
I am only craftsman. Maybe good one, but only craftsman.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CSV-magic.tar.gz
Type: application/x-gzip
Size: 3525 bytes
Desc: not available
URL: <http://mail.tiolive.com/pipermail/erp5-dev/attachments/20071017/4da2e3f1/attachment.bin>
Jean-Paul Smets
2007-10-17 12:43:41 UTC
Post by Jérome Perrin
- Import of categories from a spreadsheet (Import/Export action from
portal_categories). This uses oood to convert this input file, if for
instance you upload a .xls file. I haven't try with CSV, I don't know if
oood can convert from CSV.
From what I know, import from categories uses sxc file as input.
- There is an interesting prototype on person module: "Import Persons
from OpenOffice Calc".
This tool propose to map openoffice column to document properties. As
far as I know it doesn't work for categories, and probably have other
problems, but it's IMHO a good start.
It should work for categories (i.e. to map values to categories). I saw
it work 1 year ago.
It would be quite easy to add a step of conversion to an ODS/SXC file.
Post by Jérome Perrin
As far as I know this is the current status of OOo import in ERP5. If we
want better csv support, one way might be to add more csv to ods / ods
to csv support directly in oood, by "normalizing" csv in python using
csv module. This is just an idea, I haven't tried it at all, and I'm not
sure that information like "this column is numeric" can be kept with csv
only anyway ...
I recommend adding a call to oood to convert input files before
proceeding to the import. This way, the import tools become universal
and support all formats.

Maybe you should also point to the ODS export skin, which includes some
universal export capability. It is interesting because it shows that it
can be very easy to add conversion to import / export with oood.

Regards,

JPS.
_______________________________________________
Erp5-dev mailing list
Erp5-dev at erp5.org
http://mail.nexedi.com/mailman/listinfo/erp5-dev
--
Jean-Paul Smets-Solanes, Nexedi CEO - Tel. +33(0)6 62 05 76 14
Nexedi: Consulting and Development of Libre / Open Source Software
http://www.nexedi.com
ERP5: Libre/ Open Source ERP Software for small and medium companies
http://www.erp5.org
Jérome Perrin
2007-10-17 13:50:53 UTC
Post by Jean-Paul Smets
Post by Jérome Perrin
- There is an interesting prototype on person module: "Import Persons
from OpenOffice Calc".
This tool propose to map openoffice column to document properties. As
far as I know it doesn't work for categories, and probably have other
problems, but it's IMHO a good start.
It should work for categories (ie. to map values to categories). I saw
it work 1 year ago.
You are right, this code calls a script named
ERP5Site_getCategoriesFullPath that searches the catalog to get the
category path from its reference or title.

Jérome
Jérome Perrin
2007-10-17 13:46:20 UTC
Post by Jérome Perrin
- Import of categories from a spreadsheet (Import/Export action from
portal_categories). This uses oood to convert this input file, if for
instance you upload a .xls file. I haven't try with CSV, I don't know if
oood can convert from CSV.
I can give more precise information: this is done in the
CategoryTool_importCategoryFile script from the erp5_core business
template. If the user provides a file which is not in OpenOffice format,
it creates a newTempOOoDocument and calls convert('sxc'), which uses
oood to convert the file content to sxc format.
Post by Jérome Perrin
- There is an interesting prototype on person module: "Import Persons
from OpenOffice Calc".
This tool propose to map openoffice column to document properties.
This action is also an object_exchange action on the Person Module
portal type. The script is ERP5Site_importObjectFromOOo from erp5_base.
With this tool, you upload the spreadsheet once, then the dialog
contains a listbox where you see your spreadsheet columns, and you can
choose which document property to use for each spreadsheet column.
Once you have specified this in the listbox, you upload the spreadsheet
again and it will create objects, one per line in your spreadsheet, and
edit them according to what you defined in the listbox.
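
The column-to-property step can be pictured with the small sketch below
(Python 3). It is not the code of ERP5Site_importObjectFromOOo; the
function name rows_to_properties, the column titles and the property
names are made up for this example. The mapping chosen in the listbox is
treated as a dictionary, and each spreadsheet line becomes one
dictionary of properties that could then be used to create and edit a
document.

```python
def rows_to_properties(header, rows, column_to_property):
    """Turn spreadsheet rows into one property dictionary per row.

    Hypothetical helper: column_to_property stands for the choices made
    in the listbox, e.g. {"Name": "first_name"}.
    """
    documents = []
    for row in rows:
        properties = {}
        for column, cell in zip(header, row):
            prop = column_to_property.get(column)
            if prop:  # columns the user left unmapped are skipped
                properties[prop] = cell
        documents.append(properties)
    return documents


# Illustrative data; the "Notes" column is deliberately left unmapped.
header = ["Name", "Surname", "Notes"]
rows = [["Ada", "Lovelace", "imported"], ["Alan", "Turing", ""]]
mapping = {"Name": "first_name", "Surname": "last_name"}
for properties in rows_to_properties(header, rows, mapping):
    print(properties)
```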

Jérome
Brice LEROY
2007-10-18 14:39:49 UTC
Hello,
To continue on the OOorg import: I have just finished importing more
than 20000 objects (Sale Orders, Persons...) with this tool. It works
fine below 5000 records; beyond that, the XML parser eats all the RAM
(4 GB) and stops working! I know that this problem is not due to
anything ERP5-specific, and I don't think I used the OOorg import in the
right way (it is cool for frequently importing small documents, but not
made for importing a mass of documents in one go :) ). For information,
I used a script to transform the CSV into dict format and launched the
import with an external method (the external method has been deleted now
that the imports are done).
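
One way to stay under such a limit is to feed the import in batches. The
sketch below (Python 3) is only an illustration: the function name
iter_batches is made up, and the default of 5000 simply reuses the
figure observed above rather than a measured safe value.

```python
from itertools import islice


def iter_batches(rows, batch_size=5000):
    """Yield successive lists of at most batch_size rows.

    Hypothetical helper: rows can be any iterable, for example a
    csv.reader, so the whole file never has to be loaded at once.
    """
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Illustrative run: 12000 fake rows split into three batches.
sizes = [len(batch) for batch in iter_batches(range(12000))]
print(sizes)
```

Each batch could then be handed to a separate import run, so that memory
use depends on the batch size instead of the total number of rows.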

About the use of categories, I haven't found a way to use them
correctly: I used OOorg's find feature to correct the fields before
import :( ... The most important thing for me is the relation between
objects (i.e. source and destination in sale_order). It could be
interesting to specify, for a given property, a list of portal types to
search (I do this, but hardcoded), and to build a dictionary of possible
matches at the beginning of the import, like:

corresponding_dict['destination'] = { title: relativeUrl,... }

and find correct object with :

if property in corresponding_dict:
    value = corresponding_dict[property].get(value, value)
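
A self-contained version of this lookup might look as follows (Python
3). The helper names, the titles and the relative URLs are made up for
the example; in a real import the (title, relative URL) pairs would come
from searching the chosen portal types, which is not shown here.

```python
def build_corresponding_dict(candidates_by_property):
    """Build {property: {title: relative_url}} from (title, url) pairs.

    Hypothetical helper: the pairs are passed in directly instead of
    being fetched from a catalog search.
    """
    return {
        prop: {title: url for title, url in pairs}
        for prop, pairs in candidates_by_property.items()
    }


def resolve(corresponding_dict, prop, value):
    """Replace a title by its relative URL when the property is mapped."""
    if prop in corresponding_dict:
        # Unknown titles are returned unchanged, as in the snippet above.
        return corresponding_dict[prop].get(value, value)
    return value


# Illustrative data: one relation property with two known organisations.
corresponding_dict = build_corresponding_dict({
    "destination": [("ACME", "organisation_module/1"),
                    ("Globex", "organisation_module/2")],
})
print(resolve(corresponding_dict, "destination", "ACME"))
print(resolve(corresponding_dict, "title", "ACME"))
```

Building the dictionary once at the start of the import avoids one
search per imported line.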
--
Brice LEROY
CWD Sellier / Sellerie de Nontron
Service Informatique
tel : 05 53 60 72 70
fax : 05 53 60 72 79